Abstract

In forensics it is a classical problem to determine, when a suspect S 
shares a property T with a criminal C, the probability that S = C. In 
this paper we give a detailed account of this problem in various degrees 
of generality. We start with the classical case where the probability 
of having T, as well as the a priori probability of being the criminal, 
is the same for all individuals. We then generalize the solution to 
deal with heterogeneous populations, biased search procedures for the 
suspect, T-correlations, uncertainty about the subpopulation of the 
criminal and the suspect, and uncertainty about the T-frequencies. 
We also consider the effect of the way the search for S is conducted, 
in particular when this is done by a database search. A recurring
theme is that careful conditioning is essential when one
wants to quantify the "weight" of the evidence by a likelihood ratio.
Apart from these mathematical issues, we also discuss the practical
problems in applying these results to the legal process. The posterior
probabilities of C = S are typically the same for all reasonable choices 
of the hypotheses, but this is not the whole story. The legal process 
might force one to dismiss certain hypotheses, for instance when the 
relevant likelihood ratio depends on prior probabilities. We discuss 
this and related issues as well. As such, the paper is relevant both 
from a theoretical and from an applied point of view. 
1 Introduction



In 1968, a couple stood trial in a notorious case, known as "People of the
State of California vs. Collins". The pair had been arrested because they matched
eye-witness descriptions. The prosecution estimated that only one
in twelve million couples would match this description. The jury were invited
to consider the probability that the accused pair were innocent and returned
a verdict of guilty.

Later, the verdict was overturned, essentially because of the flaws in the
statistical reasoning. The case sparked interest in the abstraction of this 
problem, which became known as the island problem, following terminology 
introduced by Eggleston [I]. Its formulation is the following. A crime has 
been committed by an unknown member of a population of N + 1 individuals.
It is known that the criminal has a certain property T. Each individual has 
T (independently) with probability p. A random member of the population 
is tested and observed to have T. What is the probability that it is the 
criminal? 

This problem has been quite extensively studied in the literature. For 
example, Balding and Donnelly [TJ give a detailed account of the island prob- 
lem as well as of its generalization to inhomogeneous populations or (alterna- 
tively) uncertainty about p. They also discuss the effects of a database search 
or a sequential search (i.e., a search which stops when the first T-bearer is 
found). Dawid and Mortera have studied the generalization of the island 
problem to the case where the evidence may be unreliable [21 E]. 

The current paper is expository in the sense that some of the above men- 
tioned results are reproduced - albeit presented in a somewhat different way 
- and a research article in the sense that we consider generalizations which 
to our knowledge have not appeared elsewhere. Apart from the expository 
versus research nature, there is another duality in this paper, namely the 
distinction between the purely mathematical view versus a more applied 
viewpoint, and we elaborate on this issue first. 

Most texts focus on the "likelihood ratio", the quantity that transforms 
"prior" odds of guilt, that is, before seeing the evidence, into "posterior" 
odds after seeing the evidence. There is good reason to do so. Indeed, the 
likelihood ratio is often viewed as the weight of the evidence - it is therefore 
the quantity of interest for a forensic lab, which is unable or not allowed to 
compute prior (or posterior, for that matter) odds, this being the domain 
of the court. However, this already implies a first question. Which part of 
the available data should be seen as the evidence, and which part is "just" 
background information? In other words: which evidence do we consider and 
what is the context? Indeed, the weight of the evidence, that is, the value of 
the likelihood ratio, sometimes depends on which of the available information
is regarded as background information or as evidence (and of course also on
the propositions that one is interested in proving). From a purely mathematical
point of view, concentrating on the "posterior probabilities", that is,
the probability that a suspect is guilty, given background information and/or
evidence, settles the issue. Indeed, it is well known ([7]) that the posterior 
probabilities are invariant under different choices of the hypotheses as long 
as they are "conditionally equivalent given the data". Hence, from a purely 
mathematical point of view, the situation is quite clear, and one should con- 
centrate on the posterior probabilities rather than on the likelihood ratios. 

However, from a legal perspective things are not so simple. The likelihood 
ratio is, as mentioned earlier, supposed to be in the domain of the statistical 
expert, but what if this likelihood ratio involves prior probabilities itself? 
We will see concrete examples of this in this article, and in these cases the 
classical point of view (likelihood ratio is for the expert, the rest is for the 
court) does not seem to immediately apply. If we have the choice among 
various likelihood ratios, are there reasons to prefer one over the other? This
question will also be addressed in particular cases in this paper.

For the island problem, the above discussion is relevant as soon as the 
population has subpopulations, each with their own T-frequency. In that 
case, treating the fact that the criminal has T as background information on
the one hand or as evidence on the other leads to different likelihood ratios,
but the posterior odds are (of course) the same. We will go into this
phenomenon in detail, considering subpopulations simultaneously with uncertainty
about the subpopulation to which the criminal and the suspect belong,
together with uncertainty about the T-frequencies in each of the subpopula- 
tions. Another possibility which we will consider is that of T-correlation or a 
biased search (i.e., the choice of suspect depends on the true identity of the 
criminal). 

The outline of this paper is as follows. In Section 2, we review the clas- 
sical island problem. We then consider in Section 3 the effect of having a 
biased search protocol, and of having T-correlations; we show that these two
types of dependency are strongly related to each other. In
Section 4, we treat the case where the population is a disjoint union of sub- 
populations, each with their own T-frequency and prior probability of having 
issued the criminal. In Section 5, we consider the effect of uncertainty of the 
T-frequencies, both in a homogeneous and heterogeneous population. In ad- 
dition, we investigate the effect on the likelihood ratio of uncertainty about 
the criminal's and the suspect's subpopulations. Section 6 deals with the 
case in which a suspect is found through a match in a database. Finally, in 
Section 7 we present a number of numerical examples.

We have tried to include all details of the computations, but at the same
time to state our conclusions in a non-technical and accessible way. Our main
conclusions can be recognized in the text as bulleted (•) lists. As such, we
hope that our contribution is interesting and useful for mathematicians,
forensic scientists and legal representatives alike.



2 The classical case 

Our starting point is a collection X of N + 1 individuals. All forthcoming
random variables are defined on a (non-specified) probability space with
probability measure P. The random variables C and S take values in X and
represent the criminal and the suspect respectively. Furthermore, we have a
characteristic T, for which we introduce indicator random variables $T_x$, taking
value 1 if $x \in X$ has the characteristic T and 0 otherwise. The $T_x$ are
independent of (S, C) in the sense that $P(T_x = 1 \mid C = y, S = z) = P(T_x = 1)$
for all x, y, z. The number of T-bearers is written as $U = \sum_{x \in X} T_x$.
We are primarily interested in the conditional probability

$$P(C = s \mid S = s, T_C = T_s = 1);$$

often we follow the habit of stating the so-called posterior odds in favour of
guilt, that is,

$$\frac{P(C = s \mid S = s, T_C = T_s = 1)}{P(C \neq s \mid S = s, T_C = T_s = 1)}. \tag{2.1}$$

Since we will often be working conditional on {S = s} we introduce the 
notation 

$$P_s(\cdot) = P(\cdot \mid S = s).$$

We define the events $I := \{T_C = 1\}$, $G := \{S = C\}$, $E := \{T_S = 1\}$,
$E_x := \{T_x = 1\}$ and $G_x := \{x = C\}$. We will sometimes refer to the
event I (or similar events) as "information", and to E (or similar events)
as "evidence"; this is just colloquial use of language, and sometimes we will
view I as part of the evidence.

With this notation, (2.1) reads

$$\frac{P_s(G \mid I, E)}{P_s(G^c \mid I, E)},$$

which can be rewritten in two different ways, namely

$$\frac{P_s(G \mid I, E)}{P_s(G^c \mid I, E)} = \frac{P_s(E \mid G, I)}{P_s(E \mid G^c, I)} \cdot \frac{P_s(G \mid I)}{P_s(G^c \mid I)} \tag{2.2}$$

or

$$\frac{P_s(G \mid I, E)}{P_s(G^c \mid I, E)} = \frac{P_s(I, E \mid G)}{P_s(I, E \mid G^c)} \cdot \frac{P_s(G)}{P_s(G^c)}. \tag{2.3}$$



The left hand side of these equations is called the posterior odds. In (2.2), we
arrive at the posterior odds by "starting out" with background information
I via the quotient $P_s(G \mid I)/P_s(G^c \mid I)$, called the prior odds. These prior
odds are transformed into the posterior odds by multiplication with
$P_s(E \mid G, I)/P_s(E \mid G^c, I)$. This latter quotient is called the likelihood ratio and
is supposed to be a measure of the strength of the evidence E. On the
other hand, in (2.3) we "start out" from prior odds $P_s(G)/P_s(G^c)$, that is,
we interpret both I and E as evidence. The likelihood ratio in that case
is $P_s(I, E \mid G)/P_s(I, E \mid G^c)$ and measures the "combined" strength of the
evidence I and E.

In this section treating the classical case, we assume that C and S are
independent and that C is uniformly distributed on X. Furthermore, the $T_x$
are independent and identically Bernoulli distributed with success probability
p. These assumptions are not without problems when applied to concrete
legal cases. The assumption that C is uniformly distributed means that we a
priori regard each member of the population as equally likely to be the criminal.
It is probably the case that computations based on this assumption cannot be
used as legal evidence. However, many of the computations below can also
be done with other choices for the distribution of C. Having a particular
choice in mind does allow us to compare various formulas in a meaningful
way. The independence and equidistribution of the $T_x$ will be relaxed later on
in this paper, in various ways: one can consider subpopulations with different
frequencies, allow dependencies between the $T_x$ or incorporate uncertainty in
the probability p. The independence between C and S will also be relaxed
later on.

The outcomes in the current section do not depend on the particular s
we condition on, but for the sake of consistency, we do write $P_s$ instead of P.
We abbreviate $E := E_s$. The independence between S and C now implies
that $P_s(G) = 1/(N + 1)$. Both likelihood ratios in (2.2) and (2.3) are equal
to 1/p. It easily follows that

$$\frac{P_s(G \mid I, E)}{P_s(G^c \mid I, E)} = \frac{1}{p} \cdot \frac{P_s(G \mid I)}{P_s(G^c \mid I)} = \frac{1}{p} \cdot \frac{P_s(G)}{P_s(G^c)} = \frac{1}{Np}. \tag{2.4}$$

In this case it does not really matter which viewpoint one takes: the likelihood
ratio is a function of p alone, and does not involve any prior knowledge. Of course,
as mentioned before, in a legal setting it is not clear that uniform priors are
acceptable or useful, and starting from other prior probabilities is
possible in this framework.
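As a small worked illustration of (2.4), the following sketch multiplies the likelihood ratio 1/p by the uniform prior odds 1/N and converts the result to a probability. The function name and the example values of N and p are hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def classical_posterior(N, p):
    """Posterior odds and probability of guilt in the classical island problem.

    N + 1 is the population size and p the T-frequency; both the uniform
    prior on C and the independence of S and C are assumed, as in Section 2.
    """
    prior_odds = Fraction(1, N)            # P_s(G) / P_s(G^c), with P_s(G) = 1/(N+1)
    likelihood_ratio = 1 / Fraction(p)     # 1/p, the same in (2.2) and (2.3)
    odds = likelihood_ratio * prior_odds   # equation (2.4): 1/(Np)
    probability = odds / (1 + odds)        # equals 1/(1 + Np)
    return odds, probability


# Illustrative numbers only: 101 islanders, T-frequency 1/50.
odds, probability = classical_posterior(N=100, p=Fraction(1, 50))
print(odds, probability)
```

Using exact fractions keeps the identity between the odds 1/(Np) and the probability 1/(1 + Np) free of rounding error.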



In the next two subsections we will examine (for the classical case) how
$P_s(G \mid I, E)$ is related to the random variable U. It turns out that we may
express $P_s(G \mid I, E)$ both as the inverse of the expectation of U and as the
expectation of the inverse of U, as long as we condition correctly.



2.1 Expected number of T-bearers. 

Before anyone is tested for T, U has a Bin(N + 1, p)-distribution. When the
crime is committed and it is observed that the criminal has T, we condition
on $T_C = 1$ and obtain

$$P_s(U = k + 1 \mid I) = \frac{P_s(I \mid U = k + 1)\, P_s(U = k + 1)}{P_s(I)} = \frac{\frac{k+1}{N+1} \binom{N+1}{k+1} p^{k+1} (1 - p)^{N-k}}{p} = \binom{N}{k} p^k (1 - p)^{N-k}.$$

It follows that the probability that U = k + 1, given I, is equal to the
probability that a random variable with a Bin(N, p)-distribution takes value
k, i.e., $U \mid I$ is distributed as 1 + Bin(N, p). Hence, writing $E_s$ for expectation
with respect to $P_s$, we have

$$E_s(U \mid I) = 1 + Np.$$

Thus, the posterior probability of guilt is given by the inverse of the expected
number of T-bearers, where this expectation takes into account that there is
a specific individual - the criminal - who has T:

$$P_s(G \mid I, E) = \frac{1}{1 + Np} = \frac{1}{E_s(U \mid I)}. \tag{2.5}$$

Intuitively this makes sense: the criminal is a T-bearer, any one of the T- 
bearers is equally likely to be the criminal, and we have found one of them. So 
we have to compute the expected number of T-bearers, given the knowledge 
that C is one of them. 



2.2 Expected inverse number of T-bearers 

As we have seen, $U \mid I$ is distributed as 1 + Bin(N, p). Therefore, one expects
$E_s(U \mid I) = 1 + Np$ bearers of T. If we in addition also condition on $T_s = 1$,
we compute

$$P_s(U = k \mid E, I) = \frac{P_s(E \mid U = k, I)\, P_s(U = k \mid I)}{P_s(E \mid I)} = \frac{\frac{k}{N+1}\, P_s(U = k \mid I)}{\frac{1 + Np}{N+1}} = \frac{k}{1 + Np}\, P_s(U = k \mid I).$$

We use this calculation to obtain:

$$E_s(U^{-1} \mid E, I) = \sum_{k=1}^{N+1} \frac{1}{k}\, P_s(U = k \mid E, I) \tag{2.6}$$

$$= \sum_{k=1}^{N+1} \frac{1}{k} \cdot \frac{k}{1 + Np}\, P_s(U = k \mid I) \tag{2.7}$$

$$= \frac{1}{1 + Np}. \tag{2.8}$$

Summarizing,

$$P_s(G \mid E, I) = (E_s(U \mid I))^{-1} = E_s(U^{-1} \mid I, E). \tag{2.9}$$

So $P_s(G \mid I, E)$ is in fact also equal to the expectation of $U^{-1}$, however, not
of $U \mid I$ but of $U \mid I, E$. This can be understood in an intuitive way: both S
and C have T, they have been sampled with replacement, so the probability
that they are equal is the inverse of the number of T-bearers. This number
is unknown, so we have to take expectations, given knowledge of S and C.

When we compare this explanation with the one of (2.5), we see the
importance of careful conditioning.
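The identity (2.9) can be checked numerically. The sketch below is an illustration rather than part of the original derivation: it builds the distribution of U given I as 1 + Bin(N, p), reweights it by k/(1 + Np) to condition on E as well, and compares the three quantities. The function names are hypothetical.

```python
from math import comb


def pmf_U_given_I(N, p):
    """P(U = k | I) for k = 1..N+1, using that U | I is 1 + Bin(N, p)."""
    return {k: comb(N, k - 1) * p ** (k - 1) * (1 - p) ** (N - k + 1)
            for k in range(1, N + 2)}


def check_identity_2_9(N, p):
    """Return (1 / E(U | I), E(1/U | I, E), 1 / (1 + Np))."""
    given_I = pmf_U_given_I(N, p)
    expected_U = sum(k * w for k, w in given_I.items())
    # Conditioning on E as well reweights each k by k / (1 + Np).
    given_I_and_E = {k: k * w / (1 + N * p) for k, w in given_I.items()}
    expected_inverse_U = sum(w / k for k, w in given_I_and_E.items())
    return 1 / expected_U, expected_inverse_U, 1 / (1 + N * p)


print(check_identity_2_9(N=20, p=0.1))
```

All three returned values agree up to floating-point error, while taking the expectation of 1/U under the distribution given I alone would give a different number.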



2.3 Effect of a search, Yellin's formula 

So far, S and $T_S$ were supposed to be independent of each other. In this
subsection, we consider a different situation. The random variable C
representing the criminal is still supposed to be uniformly distributed, but the
definition of S is different: we repeatedly select from X - with or without
replacement - until a T-bearer is found, without keeping any records on the
search itself, such as its duration. The T-bearer found this way is denoted by
S; if there is no T-bearer in the population, we set $S = *$, and define $T_* = 0$.
As before we write $E = \{T_S = 1\}$ and note that in this situation $I \subset E$.

As above, we are interested in $P_s(G \mid E, I)$ which, since $I \subset E$, reduces
to $P_s(G \mid I)$, and this conditional probability is easy to compute:

$$P_s(G \mid E, I) = P_s(G \mid I) = \sum_{k=1}^{N+1} k^{-1} P_s(U = k \mid I) = E_s(U^{-1} \mid I). \tag{2.10}$$

This formula was published by Yellin in [TD] as the solution to this version 
of the island problem with a search. Sometimes, however, it is incorrectly 
quoted in the literature (e.g. in [TJ) as an incorrect solution to the island 
problem without search as we have discussed it. 
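To see how much the search formula (2.10) differs from the no-search answer 1/(1 + Np), one can evaluate both. The sketch below is illustrative only; the closed form used for comparison is the standard binomial identity E[1/(1 + B)] = (1 - (1 - p)^(N+1)) / ((N + 1)p) for B distributed as Bin(N, p), which is not stated in the text above, and all function names are hypothetical.

```python
from math import comb


def search_posterior(N, p):
    """E(1/U | I) with U | I distributed as 1 + Bin(N, p), as in (2.10)."""
    return sum(comb(N, j) * p ** j * (1 - p) ** (N - j) / (j + 1)
               for j in range(N + 1))


def search_posterior_closed_form(N, p):
    """Closed form of the same expectation (standard binomial identity)."""
    return (1 - (1 - p) ** (N + 1)) / ((N + 1) * p)


def no_search_posterior(N, p):
    """Posterior probability of guilt without a search: 1 / E(U | I)."""
    return 1 / (1 + N * p)


N, p = 50, 0.02  # illustrative values
print(search_posterior(N, p), no_search_posterior(N, p))
```

By Jensen's inequality the search value is never smaller than the no-search value, which is why quoting (2.10) as the solution to the problem without a search overstates the probability of guilt.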



2.4 Conclusions 

• The classical version of the island problem is not difficult to solve, but
the relation between the probability of guilt and the expected number
of T-bearers is rather subtle. The basic formula is

$$P_s(G \mid I, E) = (E_s(U \mid I))^{-1} = E_s(U^{-1} \mid I, E) = \frac{1}{1 + Np}.$$

In the case of a search we have $I \subset E$ and this leads to

$$P_s(G \mid E, I) = E_s(U^{-1} \mid I).$$

These outcomes are independent of s.

• For the value of the likelihood ratio, it does not matter whether
one interprets I as background information or as evidence - in both
cases the value is 1/p and this quantity does not depend on any prior
knowledge.

• The prior odds, the likelihood ratio and (hence) the posterior odds are
all independent of s.



3 Dependencies 

In this section we relax the condition that the $T_x$ are independent random
variables or that S and C are independent. To this end, we define

$$c_{x,y} = P(T_x = 1 \mid T_y = 1), \tag{3.1}$$

$$a_{x,y} = P(S = x \mid C = y, I). \tag{3.2}$$

3.1 Independent T_x

First we assume that the $T_x$ are independent (not necessarily identically
distributed) random variables, but C and S are not. This is the case, for
instance, in a biased search situation. It also accounts for selection effects,
where certain members of the population are more likely to become a suspect
than others. We write $p_x$ for $P(T_x = 1)$. Now (2.1) becomes



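$$\frac{P_s(E \mid G, I)\, P_s(G \mid I)}{P_s(E \mid G^c, I)\, P_s(G^c \mid I)} = \frac{P(E \mid G, S = s, I)\, P(G \mid S = s, I)}{P(E \mid G^c, S = s, I)\, P(G^c \mid S = s, I)}$$

$$= \frac{1}{p_s} \cdot \frac{P(G, S = s \mid I)}{P(G^c, S = s \mid I)}$$

$$= \frac{1}{p_s} \cdot \frac{P(S = s \mid C = s, I)\, P(C = s \mid I)}{P(S = s \mid C \neq s, I)\, P(C \neq s \mid I)}$$

$$= \frac{1}{p_s} \cdot \frac{a_{s,s}\, P(C \neq s \mid I)}{\sum_{y \neq s} a_{s,y}\, P(C = y \mid I)} \cdot \frac{P(C = s \mid I)}{P(C \neq s \mid I)}. \tag{3.3}$$
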

In this last expression (3.3), the first term $1/p_s$ is the likelihood ratio
in case of a search such that $a_{s,s} = a_{s,y}$ for all $y \neq s$, i.e., such that the
probability of selecting s is independent of C. In particular, this holds for
a search where S is uniformly random but other distributions of (S, C) may
also satisfy this criterion.

The middle term in (3.3) is the term that accounts for the bias of the
search, i.e., it expresses the effect of the dependence between S and C in the
case S = s.

The last term of (3.3) is the "prior odds", the odds in favour of C = s,
when I is taken into account. It is of course also possible to start from "prior
odds" $P(C = s)/P(C \neq s)$; this will yield the same posterior odds, but a
different expression for the likelihood ratio. We will make this explicit for
some special cases later on.
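The three factors of (3.3) can be computed separately, which makes the effect of a biased search visible. The following is a minimal sketch under the assumptions of this subsection (independent T_x); the function name, the argument layout and the example numbers are hypothetical.

```python
def biased_search_factors(s, p, a, prior_C_given_I):
    """Return the three factors of (3.3) for suspect index s.

    p[x]               : P(T_x = 1)
    a[x][y]            : P(S = x | C = y, I), the search coefficients (3.2)
    prior_C_given_I[y] : P(C = y | I)
    """
    others = [y for y in range(len(p)) if y != s]
    prob_not_s = sum(prior_C_given_I[y] for y in others)
    unbiased_lr = 1 / p[s]
    bias_term = (a[s][s] * prob_not_s
                 / sum(a[s][y] * prior_C_given_I[y] for y in others))
    prior_odds = prior_C_given_I[s] / prob_not_s
    return unbiased_lr, bias_term, prior_odds


# Illustrative example with three individuals: individual 0 is three times as
# likely to be selected when he is the criminal as when someone else is.
p = [0.1, 0.1, 0.1]
a = [[0.6, 0.2, 0.2],
     [0.2, 0.6, 0.2],
     [0.2, 0.2, 0.6]]
prior = [1 / 3, 1 / 3, 1 / 3]
lr, bias, odds = biased_search_factors(0, p, a, prior)
print(lr * bias * odds)  # posterior odds in favour of guilt
```

When every entry in row s of `a` is the same, the bias term equals 1 and the posterior odds reduce to the unbiased case.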



3.2 Arbitrary T_x

We now assume again that S and C are independent, but we drop the
assumption that the $T_x$ are independent. In that case, we can write

$$\frac{P_s(G \mid I, E)}{P_s(G^c \mid I, E)} = \frac{P_s(E \mid G, I)\, P_s(G \mid I)}{P_s(E \mid G^c, I)\, P_s(G^c \mid I)} = \frac{P_s(G \mid I)}{P_s(E, G^c \mid I)}.$$

Since we have assumed that the $T_x$ are independent of S and C, we have

$$P_s(E \mid I, C = y) = P(T_s = 1 \mid T_y = 1) = c_{s,y},$$

and we continue as

$$\frac{P_s(G \mid I)}{P_s(E, G^c \mid I)} = \frac{P_s(G \mid I)}{\sum_{y \neq s} P_s(E, C = y \mid I)} = \frac{P_s(G \mid I)}{\sum_{y \neq s} P_s(E \mid C = y, I)\, P_s(C = y \mid I)}$$

$$= \frac{P_s(G \mid I)}{\sum_{y \neq s} c_{s,y}\, P_s(C = y \mid I)} \tag{3.4}$$

$$= \frac{1}{p_s} \cdot \frac{P_s(G^c \mid I)}{\sum_{y \neq s} \frac{c_{s,y}}{p_s}\, P_s(C = y \mid I)} \cdot \frac{P_s(G \mid I)}{P_s(G^c \mid I)}. \tag{3.5}$$
(3.5) 

As for the case of a biased search, the term $1/p_s$ is the likelihood ratio
that we obtain in the case where the T-correlations do not play a role, i.e.,
when $c_{s,y} = p_s$ for all $y \neq s$. The middle term, analogously to (3.3),

$$\frac{P_s(C \neq s \mid I)}{\sum_{y \neq s} \frac{c_{s,y}}{p_s}\, P_s(C = y \mid I)}$$

accounts for the T-correlations, and the last term

$$\frac{P_s(G \mid I)}{P_s(G^c \mid I)} = \frac{P(C = s \mid I)}{P(C \neq s \mid I)}$$

describes the prior odds, conditional on $I = \{T_C = 1\}$. If we remove this
conditioning, we get

$$\frac{P_s(G \mid I, E)}{P_s(G^c \mid I, E)} = \frac{P_s(G)}{\sum_{y \neq s} c_{y,s}\, P_s(C = y)} \tag{3.7}$$

$$= \frac{1}{p_s} \cdot \frac{P_s(C \neq s)}{\sum_{y \neq s} \frac{c_{y,s}}{p_s}\, P_s(C = y)} \cdot \frac{P_s(G)}{P_s(G^c)}. \tag{3.8}$$

As for (3.5), the last line contains three terms: the likelihood ratio $1/p_s$
in the uncorrelated case, the term due to the correlation and the prior odds.
Finally, note that (3.4) and (3.7) imply

$$\frac{P_s(G \mid I, E)}{P_s(G^c \mid I, E)} = \frac{P_s(G \mid I)}{\sum_{y \neq s} c_{s,y}\, P_s(C = y \mid I)} = \frac{P_s(G)}{\sum_{y \neq s} c_{y,s}\, P_s(C = y)} \tag{3.9}$$

(or equivalently, the symmetry between the middle terms in (3.5) and (3.8)):
the way the correlations between the $T_x$ appear in the posterior odds depends
on whether or not one considers $I = \{T_C = 1\}$ to be evidence, or an event
upon which everything is conditional.
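The equality (3.9) can be verified by brute force on a small population. The sketch below is only an illustration: it takes an arbitrary (hypothetical) joint distribution of three indicators, assumes C is independent of S and of the T_x, and computes the posterior odds directly as well as through (3.4) and (3.7).

```python
from itertools import product


def posterior_odds_three_ways(joint, prior_C, s):
    """joint maps T-vectors (tuples of 0/1) to probabilities;
    prior_C[y] = P(C = y). Returns (direct, via (3.4), via (3.7))."""
    n = len(prior_C)

    def prob(pred):
        return sum(w for t, w in joint.items() if pred(t))

    p = [prob(lambda t, x=x: t[x] == 1) for x in range(n)]
    # c[x][y] = P(T_x = 1 | T_y = 1), as in (3.1)
    c = [[prob(lambda t, x=x, y=y: t[x] == 1 and t[y] == 1) / p[y]
          for y in range(n)] for x in range(n)]
    others = [y for y in range(n) if y != s]

    # Direct computation: P(C = s, I, E) / P(C != s, I, E).
    direct = (prior_C[s] * p[s]) / sum(
        prior_C[y] * prob(lambda t, y=y: t[y] == 1 and t[s] == 1)
        for y in others)

    # (3.4): first condition the distribution of C on I.
    prob_I = sum(prior_C[y] * p[y] for y in range(n))
    C_given_I = [prior_C[y] * p[y] / prob_I for y in range(n)]
    via_3_4 = C_given_I[s] / sum(c[s][y] * C_given_I[y] for y in others)

    # (3.7): treat I as part of the evidence.
    via_3_7 = prior_C[s] / sum(c[y][s] * prior_C[y] for y in others)
    return direct, via_3_4, via_3_7


# Hypothetical dependent joint law of (T_0, T_1, T_2).
weights = [8, 1, 2, 1, 3, 1, 1, 3]
joint = {t: w / sum(weights)
         for t, w in zip(product((0, 1), repeat=3), weights)}
prior_C = [0.5, 0.3, 0.2]
print(posterior_odds_three_ways(joint, prior_C, s=0))
```

The three numbers coincide, even though (3.4) uses the coefficients c_{s,y} with the conditioned prior and (3.7) uses c_{y,s} with the unconditioned prior.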
3.3 Comparison of biased search and T-correlations

When we compare the posterior odds (3.3) and (3.5) of the two situations, we
see that the expressions are very similar. Both have a correction factor in the
denominator. In fact, when S and C are independent, then in (3.5) $P_s$ can
be replaced with P, and the two cases reduce to each other if $p_x a_{x,y}/a_{x,x} = c_{x,y}$
for all $x \neq y$. A trivial example of this is obtained when C is uniform on
X and the $T_x$ are independent Bernoulli random variables. More generally,
every case of a biased search without T-correlations where the search
coefficients between criminal and suspect are such that $0 < p_x a_{x,y}/a_{x,x} < 1$ is
equivalent (as far as the probability of guilt is considered) to a case where
the search is unbiased but the $T_x$ are correlated with coefficients $c_{x,y} = p_x a_{x,y}/a_{x,x}$.



4 Heterogeneous populations 

In this section we consider the situation where the population consists of
several subpopulations, each with their own T-frequency and each with their
own probability of containing the criminal. To model this, we write X as a
disjoint union of subpopulations $X_i$,

$$X = X_1 \cup \cdots \cup X_m, \tag{4.1}$$

with $X_i \cap X_j = \emptyset$ whenever $i \neq j$. If $x \in X_i$, we say that x is in subpopulation
i and write $i = X(x)$. Let $N_i = |X_i|$ be the size of subpopulation $X_i$. We
write $N_x = N_i$ if $i = X(x)$. Let

$$P(C \in X_i) = \beta_i, \tag{4.2}$$

where the $\beta_i$'s are positive and satisfy $\sum_{i=1}^m \beta_i = 1$. We assume that the
random variables $T_x$ are independent Bernoulli variables with probability of
success $p_{X(x)}$; hence they are not identically distributed as their distribution
varies for different subpopulations.



4.1 Posterior probability of guilt 

It follows from the above that we have $c_{x,y} = p_x$ for all distinct $x, y \in X$. Therefore,
it follows from (3.5) and (3.7) that

$$\frac{P_s(G \mid I, E)}{P_s(G^c \mid I, E)} = \frac{1}{p_s} \cdot \frac{P_s(G \mid I)}{P_s(G^c \mid I)} = \frac{P_s(G^c)}{\sum_{i=1}^m p_i \beta_i - p_s P_s(G)} \cdot \frac{P_s(G)}{P_s(G^c)}. \tag{4.3}$$

We can work this out in more detail in the case where S and C are
independent and C is uniform on subpopulations:

$$P(C = x \mid C \in X_{X(x)}) = 1/N_x. \tag{4.4}$$

This assumption is not a restriction, since we assume that all $T_x$ are
independent. It is always possible to split up the population into parts such that
the $T_x$ are i.i.d. on the parts and (4.4) holds (a trivial decomposition would
be into singletons).

First, we define $\alpha_i$ to be the probability that $C \in X_i$, given that C has T:

$$\alpha_i = P(C \in X_i \mid I) = \frac{P(I \mid C \in X_i)\, P(C \in X_i)}{P(I)} = \frac{p_i \beta_i}{\sum_{j=1}^m p_j \beta_j}. \tag{4.5}$$

Now, $P(C = x) = \beta_x/N_x$ and $P(C = x \mid I) = \alpha_x/N_x$ (writing $\beta_x$ and $\alpha_x$
for $\beta_{X(x)}$ and $\alpha_{X(x)}$), and (4.3) can be rewritten as

$$\frac{P_s(G \mid I, E)}{P_s(G^c \mid I, E)} = \frac{1}{p_s} \cdot \frac{\alpha_s}{N_s - \alpha_s} = \frac{\beta_s}{N_s \sum_{j=1}^m p_j \beta_j - p_s \beta_s}. \tag{4.6}$$

4.2 Likelihood ratios

It follows from (4.6) that, whether S and C are independent or not, the
likelihood ratio conditioned on I is given by

$$\frac{P_s(E \mid G, I)}{P_s(E \mid G^c, I)} = \frac{1}{p_s}. \tag{4.7}$$

If we assume independence of S and C and that C restricted to each
subpopulation is uniform, then we obtain

$$\frac{P_s(I, E \mid G)}{P_s(I, E \mid G^c)} = \frac{N_s - \beta_s}{N_s \sum_{j=1}^m p_j \beta_j - p_s \beta_s}. \tag{4.8}$$

We note two special cases. First, when $N_s$ is large (which means that the
prior probability of guilt for s is small), (4.8) is approximately equal to

$$\frac{1}{\sum_{j=1}^m p_j \beta_j}, \tag{4.9}$$

in which the subpopulation to which s belongs plays no special role. A second
special case arises when we take $N_s = 1$, and only one other subpopulation.
This is the standard practice for many forensic labs: there is a default
population (the local population), and only two hypotheses are considered: either
S = C, or C is from the default population. In that case, the likelihood ratio
(4.8) is equal to

$$\frac{1}{p_{\mathrm{def}}}, \tag{4.10}$$

where $p_{\mathrm{def}}$ is the T-frequency in the default population and $\beta_{\mathrm{def}}$, the prior
probability that C is from the default population, is equal to $1 - \beta_s$.
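The different likelihood ratios of this section are easy to compare numerically. The sketch below evaluates (4.5) to (4.9) for a given suspect subpopulation, under the assumptions that S and C are independent and C is uniform within subpopulations; the function name, the dictionary keys and the example figures are hypothetical.

```python
def heterogeneous_summary(s, p, beta, N):
    """s: index of the suspect's subpopulation.
    p[i], beta[i], N[i]: T-frequency, prior P(C in X_i) and size of X_i."""
    mean_freq = sum(pi * bi for pi, bi in zip(p, beta))   # sum_j p_j beta_j = P(I)
    alpha_s = p[s] * beta[s] / mean_freq                  # (4.5)
    return {
        "lr_given_I": 1 / p[s],                                             # (4.7)
        "lr_I_and_E": (N[s] - beta[s]) / (N[s] * mean_freq - p[s] * beta[s]),  # (4.8)
        "lr_large_Ns": 1 / mean_freq,                                       # (4.9)
        "prior_odds": beta[s] / (N[s] - beta[s]),        # P_s(G) / P_s(G^c)
        "posterior_odds": alpha_s / (p[s] * (N[s] - alpha_s)),              # (4.6)
    }


# Illustrative numbers: T is rare in the suspect's subpopulation (index 0)
# but fairly common in the other one.
summary = heterogeneous_summary(s=0, p=[0.001, 0.05], beta=[0.3, 0.7],
                                N=[1000, 9000])
for name, value in summary.items():
    print(name, value)
```

In this example the likelihood ratio (4.7) is 1000, whereas (4.8) is far smaller, because the second subpopulation offers a plausible alternative source of a T-bearing criminal. Setting `N=[1, ...]` with a single other subpopulation reproduces the default-population value (4.10).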

4.3 Discussion 

It seems that (at least) two likelihood ratios can be used to answer the
informal question "What is the weight of the evidence that the suspect has
the same characteristic as the criminal?". Contrary to the classical case
described in Section 2, the weight of the evidence depends on whether or not
we consider the fact that the criminal has T to be evidence or background
information. Depending on that choice and on the prior odds on guilt for S,
we may arrive at the reciprocal of either $p_s$, $p_{\mathrm{def}}$, or $\sum_j p_j \beta_j$. These quantities
may be very different. This articulates the fact that one should be very
careful with the use of such likelihood ratios, and that one should primarily
be interested in posterior odds rather than in likelihood ratios. A similar
warning in a different situation can be found in [6] and [?].

On the other hand, if one wants to divide the ingredients in the compu- 
tation of the posterior odds into parts that are for the court to decide, and 
parts that are for an expert witness to provide, one faces difficulties. We will 
now go into these in some detail. 

4.3.1 Choice of evidence 

The difference between the choice of conditioning on I or not, is directly
related to the difference between the questions "What is the probability that
S has T, if innocent?" and "What is the probability that C has T, if S is
innocent?"; or more informally "How else can we explain that S has T?"
versus "How else can we explain that C has T?" Indeed, if we consider both
I and E as evidence to be expressed by a single likelihood ratio, then we
can first consider E, and then I given E. But without knowledge of I, the
probability that S has T is the same under G as under $G^c$, so the likelihood
ratio of I and E together is in fact the same as the likelihood ratio of I, given
E. Thus, the issue here is that we need to decide if the fact that C has T
counts as evidence against S, or not. Should the fact that C has a certain
characteristic count as (legal) evidence against someone, because he belongs
to a subpopulation in which the characteristic is more common? Or do we
only consider the fact that S has the characteristic, knowing that C has it,
as evidence? It seems unlikely that an answer can be given in full generality,
but it is important to realize that the value of the evidence will depend on
it.

4.3.2 Role of expert 

Legal systems generally wish to make a distinction between the strength of 
the evidence, and the strength of the case. Ideally, the expert witness informs 
the court about the strength of the evidence (i.e., gives a Likelihood Ratio), 
and the court combines this information with its prior to draw conclusions 
about the strength of the case. The prior is not discussed with, or commu- 
nicated to, the expert. Hence, for this to be possible, the likelihood ratio 
should not depend on the prior of the court. Looking at (4.8) however, it is
apparent that this likelihood ratio does depend on the prior probabilities $\beta_j$
and on the suspect's population size $N_s$. The value of the legal evidence, if
taken to be both I and E, thus is a function of the prior and seems as such
to be generally not admissible in court. In the special case (4.10), however,
it is; but in that case we only obtain useful information if the assumption
that either S = C, or C is from the default population, is justified.

The likelihood ratio (4.7) does not suffer from these problems: it is a
function of the suspect's subpopulation only, irrespective of any prior, on S
or on any other person or group. Thus, if a court has somehow arrived at
a prior probability $\alpha_s = P(C \in X_s \mid I)$, it can use the expert's information
$p_s$ to proceed. But it must now be made clear to the court that there is a
distinction between the priors with or without I taken into account, and that
to compute one from the other it also needs expert information.

4.3.3 In practice: which likelihood ratio? 

We end this discussion by pointing out some pros and cons of the likelihood
ratios (4.7) and (4.8). Clearly, (4.7) only involves the suspect. This is a
conceptually satisfactory property, since it allows for a clear distinction between
prior probabilities and the value of the evidence, as we have pointed out
above. It may also provide a safeguard against using irrelevant information
as evidence. Consider, for example, the following hypothetical scenario: at
a crime scene, a hair of C is found. Analysis by a forensic hair expert shows
that C must belong to subpopulation $X_1$. Later, a suspect $S \in X_1$ is found.
From the hair a mitochondrial DNA profile is generated, and S's mitochondrial
DNA profile matches with it. The court wishes to be informed about
the value of that match. Clearly, it only makes sense to report $p_s$, since it is
at this point already known that S and C are from the same subpopulation.
But the DNA expert may not know this, and if it is standard procedure to
report a variant of (4.8), e.g. (4.10), then a profile frequency for the default,
or even the world's population, could be reported.

On the other hand, an advantage of (4.8) is that it reduces the value of the
evidence if there is a plausible alternative to S for C: if there are other groups
in which T is relatively frequent, and which have a positive prior probability,
then (4.8) decreases whereas (4.7) does not. But as we have seen, (4.8) can
only do this because it makes use of all the prior probabilities, and as such
it is likely to be inadmissible as legal evidence, especially if the court leaves
the choice of prior to the expert. A possible way out would be for the expert
to report all the $p_j$ separately to the court.

Of course, in practice $p_s$ may be hard for the expert to determine, because
he only has data about other populations, or because it is not immediately
clear to which subpopulation S belongs, or even what the subpopulations
themselves are. In that case, it may be practical (though potentially
dangerous) to use (4.10) and report $p_{\mathrm{def}}$ (together with the hypotheses!), if it is
the only statistic concerning T that the expert has knowledge of.

The difference in numerical value of (4.7) and (4.8) may lead to the
prosecution and defence having different preferences for the use of (4.7) or (a
variant of) (4.8). For example, if $p_s$ is much smaller than the weighted mean
$\sum_j p_j \beta_j$, the prosecution will prefer (4.7), but the defence will point out that
in the population as a whole, there are subpopulations in which T is much
more common, and therefore try to persuade the court that (4.8) better
reflects the value of the match. The court should realize that both points of
view can be justified: the prosecutor focuses on the suspect and comes up
with the likelihood that S has T, if not guilty; the defence focuses on C and
points out that S need not be C, since there are other good candidates.

To better understand the influence of uncertainty about the T-frequencies
in the different populations and about the suspect's and the criminal's
subpopulation, we proceed with a more detailed model involving these issues in
Section 5.

4.4 Expected number of T-bearers 

If we choose $\beta_i = N_i/(N + 1)$, so that C is uniform on X as in the classical case, then we can again
express the posterior probability of guilt as the inverse of the expected number
of T-bearers. We compute

$$E_s(U \mid I, C \in X_s) = \sum_i N_i p_i + 1 - p_s = N_s \sum_i p_i \frac{\beta_i}{\beta_s} - p_s + 1,$$

and from (4.6) it follows that

$$P_s(G \mid I, E) = \frac{1}{E_s(U \mid I, C \in X_s)}, \tag{4.11}$$

which is the analogue of (2.5). The reader may check that similarly,

$$P_s(G \mid I, E) = E_s(U^{-1} \mid I, E, C \in X_s).$$

This is the analogue of (2.9).

4.5 Without conditioning on S = s 

Assume that $S$ is uniformly distributed on $X$, and suppose we do not condition on $\{S = s\}$. Concentrating on the conditional probability of $G$ we obtain

$$P(G \mid I, E) = \sum_{s \in X} P(G \mid I, E, S = s)\, P(S = s \mid I, E). \tag{4.12}$$

The first term in the summation is computed above already, so we need only compute $P(S = s \mid I, E)$. Since information about $S$ and its $T$-status does not say anything about $T_C$, we have that

$$P(S = s \mid I, E) = P(S = s \mid T_S = 1) = \frac{P(T_S = 1 \mid S = s)\,P(S = s)}{\sum_{s' \in X} P(T_S = 1 \mid S = s')\,P(S = s')} = \frac{p_s}{\sum_{s' \in X} p_{s'}}.$$

Hence it follows that

$$P(G \mid E, I) = \frac{\sum_{s \in X} p_s Z_s}{\sum_{s \in X} p_s},$$

where

$$Z_s = P_s(G \mid I, E) = \frac{\alpha_s}{p_s(N_s - \alpha_s) + \alpha_s}.$$

Hence the posterior probability of guilt is a weighted average of the conditioned ones, with weights $p_s$.

4.6 Conclusions

• The probability of guilt in this situation is equal to

$$P_s(G \mid I, E) = \frac{\alpha_s}{p_s(N_s - \alpha_s) + \alpha_s},$$

and this answer depends on $s$ via the frequency of $T$ in the subpopulation of $s$, the distribution of $C$ and the size of the subpopulation of $s$. The sizes of the other subpopulations do not play a role other than in the assessment of the $\beta_i$ and thereby of the $\alpha_i$, i.e., in the distribution of $\alpha$.



• For the value of the likelihood ratio, it does matter whether $I$ is interpreted as background information or as evidence. For the probability of guilt this distinction is, of course, irrelevant, but we have seen that there can be reasons to prefer a particular choice. It is preferable to use a likelihood ratio which does not involve any prior knowledge. The prior should then, in theory, be estimated by the juror.

• The probability of guilt, conditioning only on the fact that the suspect has $T$ but not on the identity (subpopulation) of the suspect, is the weighted average of the individual conditional probabilities, with weight factors $p_s$. The sizes of the subpopulations and the distribution of $C$ do not play a role in the weights.
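The weighted average in the last conclusion can be illustrated numerically. The following is only a sketch under stated assumptions: the function names are hypothetical, $S$ is taken to be uniform on $X$ so that each subpopulation contributes $N_i$ individuals with weight $p_i$, and the parameter values are those of the two-subpopulation example of Section 7.2.

```python
def alpha(beta, p):
    """alpha_i = P(C in X_i | I) = beta_i p_i / sum_j beta_j p_j."""
    total = sum(b * q for b, q in zip(beta, p))
    return [b * q / total for b, q in zip(beta, p)]


def guilt_given_subpopulation(sizes, beta, p):
    """Z_i = alpha_i / (p_i (N_i - alpha_i) + alpha_i) for a suspect in X_i."""
    return [a / (q * (n - a) + a) for a, q, n in zip(alpha(beta, p), p, sizes)]


def guilt_unconditioned(sizes, beta, p):
    """Average of the Z_s over all individuals s, weighted by p_s.

    All N_i members of X_i share the same p_s and Z_s, so the sum over
    individuals collapses to a sum over subpopulations with weights N_i p_i.
    """
    z = guilt_given_subpopulation(sizes, beta, p)
    weights = [n * q for n, q in zip(sizes, p)]
    return sum(w * zi for w, zi in zip(weights, z)) / sum(weights)


# Illustrative values (assumed here; taken from the example in Section 7.2).
sizes, beta, p = [1e7, 1e5], [0.5, 0.5], [1e-9, 1e-8]
z = guilt_given_subpopulation(sizes, beta, p)
overall = guilt_unconditioned(sizes, beta, p)
print(z, overall)
```

The unconditioned value necessarily lies between the smallest and the largest of the conditional probabilities, which is a quick sanity check on any implementation.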

5 Uncertainty about the frequency of T 

In this section we assume that the $T$-frequency $P(T_x = 1)$ is not known with certainty. Instead, we describe the frequency with a probability distribution.

5.1 Classical case 

We assume that there are no subpopulations. The random variable $C$ is uniform on $X$, and $S$ and $C$ are independent. To model the uncertainty of the $T$-frequency, we assume that there is a random variable $W$, taking values in $[0, 1]$ and with density $\chi$, such that conditional on $W = r$, the $T_x$ are independent Bernoulli variables with $P(T_x = 1) = r$. We let $p$ denote the expectation of $W$ and $\sigma^2$ its variance. We again condition on $S = s$ whenever we compute odds, but all results in this section are independent of $s$.

Definition 5.1. The distribution of $W$ is called the prior-to-crime distribution and the distribution of $W$ conditioned on $I$ is called the prior-to-suspect distribution. Finally, the distribution of $W$ conditioned on both $I$ and $E$ is called the post-match distribution. The densities of these three distributions are denoted by $\chi$, $\chi_I$ and $\chi_{I,E}$ respectively.



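Since

$$P(I) = \int_0^1 P(I \mid W = t)\,\chi(t)\,dt = \int_0^1 t\,\chi(t)\,dt = p,$$

the continuous version of Bayes' theorem implies that

$$\chi_I(t) = \frac{P(I \mid W = t)\,\chi(t)}{P(I)} = \frac{t\,\chi(t)}{p}. \tag{5.1}$$

Furthermore, we have

$$\chi_{I,E}(t) = \frac{1 + Nt}{1 + N(p + \sigma^2/p)}\,\chi_I(t). \tag{5.2}$$

To see this, note that

$$\chi_{I,E}(t) = \frac{P(E \mid W = t, I)\,\chi_I(t)}{P(E \mid I)} \tag{5.3}$$

$$= \frac{1 + Nt}{1 + N}\,\frac{\chi_I(t)}{P(E \mid I)}, \tag{5.4}$$

and compute the denominator:

$$P(E \mid I) = \int_0^1 P(E \mid W = t, I)\,\chi_I(t)\,dt$$

$$= \int_0^1 \frac{1 + Nt}{1 + N}\,\frac{t}{p}\,\chi(t)\,dt \tag{5.5}$$

$$= \frac{1}{(1 + N)p}\,\bigl(p + N(p^2 + \sigma^2)\bigr) \tag{5.6}$$

$$= \frac{1 + N(p + \sigma^2/p)}{1 + N}. \tag{5.7}$$

From this, the claim readily follows.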

The expectation of $W$ given $I$ is expressed in terms of $\chi$ by

$$p' := E(W \mid I) = \int_0^1 t\,\chi_I(t)\,dt = \int_0^1 \frac{1}{p}\,t^2 \chi(t)\,dt = \frac{1}{p}\,(p^2 + \sigma^2).$$

The expected number of $T$-bearers, given $I$, is now given by

$$E(U \mid I) = \int_0^1 E(U \mid I, W = t)\,\chi_I(t)\,dt = \int_0^1 (1 + Nt)\,\chi_I(t)\,dt = 1 + Np'. \tag{5.9}$$

As in the classical case where $\sigma^2 = 0$ (cf. (2.5)), the inverse of this expression is equal to the posterior probability of guilt, since

$$P_s(G \mid I, E) = \int_0^1 P_s(G \mid I, E, W = t)\,\chi_{I,E}(t)\,dt \tag{5.10}$$

$$= \int_0^1 \frac{1}{1 + Nt}\,\frac{1 + Nt}{1 + N(p + \sigma^2/p)}\,\chi_I(t)\,dt \tag{5.11}$$

$$= \frac{1}{1 + N(p + \sigma^2/p)} = \frac{1}{1 + Np'}. \tag{5.12}$$

Since the prior probability of guilt is just $1/(N + 1)$ as before, the likelihood ratio is $1/p'$. Since this likelihood ratio is not controversial in this case, we concentrate on the posterior probability of guilt in terms of the various conditional distributions.
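The identity (5.12) can be checked numerically. The sketch below is an illustration only: it replaces the density $\chi$ by a hypothetical two-point prior for $W$ (an assumption made here to avoid numerical integration), forms the post-match weights in the same way as (5.1) and (5.2), and compares the resulting average of $1/(1 + Nt)$ with the closed form.

```python
def guilt_closed_form(N, p, sigma2):
    """(5.12): P_s(G | I, E) = 1 / (1 + N (p + sigma^2 / p))."""
    return 1.0 / (1.0 + N * (p + sigma2 / p))


def guilt_by_mixture(N, values, weights):
    """Average 1 / (1 + N t) over the post-match distribution of W.

    For a discrete prior the post-match weights are proportional to
    (1 + N t) * t * w(t), mirroring (5.1) and (5.2).
    """
    post = [(1 + N * t) * t * w for t, w in zip(values, weights)]
    norm = sum(post)
    return sum(q / norm / (1 + N * t) for q, t in zip(post, values))


# Hypothetical two-point prior: W is 0.5e-8 or 1.5e-8 with equal probability.
N = 1e7
values, weights = [0.5e-8, 1.5e-8], [0.5, 0.5]
p = sum(t * w for t, w in zip(values, weights))
sigma2 = sum(t * t * w for t, w in zip(values, weights)) - p * p

closed = guilt_closed_form(N, p, sigma2)
mixed = guilt_by_mixture(N, values, weights)
print(closed, mixed)
```

Both routes give the same number, which reflects the cancellation of the factor $1 + Nt$ in (5.11).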

As in the classical case (cf. (2.8)), we also have $P_s(G \mid I, E) = E_s(U^{-1} \mid I, E)$. Indeed,

$$E_s(U^{-1} \mid I, E) = \int_0^1 E_s(U^{-1} \mid I, E, W = t)\,\chi_{I,E}(t)\,dt \tag{5.13}$$

$$= \int_0^1 \frac{1}{1 + Nt}\,\frac{1 + Nt}{1 + N(p + \sigma^2/p)}\,\chi_I(t)\,dt \tag{5.14}$$

$$= \frac{1}{1 + N(p + \sigma^2/p)}. \tag{5.15}$$

The expectation $p'$ only depends on $\chi$ and not on the population size. This is to be expected, since learning that a (randomly chosen) population member has $T$ is not informative about the population size. This changes when we learn $E$, the fact that a randomly selected islander has $T$ as well. Indeed, in a small population this is more likely to happen since we are more likely to accidentally select the criminal. In the extreme case where $N = 0$, $E$ cannot offer any new information, but for other $N$ it does. It follows from (5.2) that

$$p'' := E_s(W \mid I, E) = \int_0^1 t\,\chi_{I,E}(t)\,dt = \frac{1}{1 + N(p + \sigma^2/p)} \int_0^1 t\,(1 + Nt)\,\frac{t}{p}\,\chi(t)\,dt$$

$$= \frac{1}{1 + N(p + \sigma^2/p)} \left( p + \frac{\sigma^2}{p} + \frac{N}{p} \int_0^1 t^3 \chi(t)\,dt \right).$$

We can also write

$$p'' = \frac{p' + N(p'^2 + \sigma_I^2)}{1 + Np'}$$

if we want to express $p''$ in terms of $\chi_I$, where $\sigma_I^2$ denotes the variance of $\chi_I$. The above formula can be rewritten as

$$p'' = p'\,\frac{1 + N(\sigma_I^2/p' + p')}{1 + Np'} \ge p',$$

with equality only if $\sigma_I^2 = 0$ or $N = 0$ (as expected, cf. the remark above).
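The ordering of the three expectations can be made concrete with a small computation. The sketch below assumes, purely for illustration, that $W$ has a Beta$(a, b)$ prior-to-crime distribution (the text does not prescribe this family); the helper names are hypothetical. It uses $p' = E(W^2)/E(W)$, which follows from (5.1), and $p'' = (E(W^2) + N E(W^3))/(E(W) + N E(W^2))$, which follows from (5.1) and (5.2).

```python
def beta_raw_moment(a, b, k):
    """k-th raw moment of a Beta(a, b) distribution."""
    m = 1.0
    for r in range(k):
        m *= (a + r) / (a + b + r)
    return m


def expected_frequencies(a, b, N):
    """Return (p, p', p'') when W has a Beta(a, b) prior-to-crime density."""
    m1, m2, m3 = (beta_raw_moment(a, b, k) for k in (1, 2, 3))
    p = m1                                      # E(W)
    p_prime = m2 / m1                           # E(W | I)
    p_double = (m2 + N * m3) / (m1 + N * m2)    # E(W | I, E)
    return p, p_prime, p_double


# Hypothetical parameters chosen only to show the ordering.
p, p_prime, p_double = expected_frequencies(2, 98, 1000)
print(p, p_prime, p_double)
```

With $N = 0$ the last two values coincide, in line with the remark that $E$ then carries no new information about $W$.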

It is perhaps worth mentioning that one can reconstruct $\chi_I$ from $\chi_{I,E}$ and $\chi$ from $\chi_I$. Indeed we have

$$\chi_I(t) = \frac{1}{1 + Nt} \left( \int_0^1 \frac{\chi_{I,E}(s)}{1 + Ns}\,ds \right)^{-1} \chi_{I,E}(t) \tag{5.16}$$

and

$$\chi(t) = \left( \int_0^1 \frac{\chi_I(s)}{s}\,ds \right)^{-1} \frac{\chi_I(t)}{t}. \tag{5.17}$$

To see this, note that from (5.2) we have

$$\chi_I(t) = \frac{1 + Np'}{1 + Nt}\,\chi_{I,E}(t). \tag{5.18}$$

On the other hand, it follows from (5.15) that

$$E_s(U^{-1} \mid I, E) = \int_0^1 \frac{1}{1 + Ns}\,\chi_{I,E}(s)\,ds = \frac{1}{1 + Np'},$$

and the first claim (5.16) follows.

For (5.17) we simply note from (5.1) that

$$\chi(t) = p\,\frac{\chi_I(t)}{t}, \tag{5.19}$$

where $p = \int_0^1 t\,\chi(t)\,dt$. Integrating this equation gives

$$1 = p \int_0^1 \frac{\chi_I(t)}{t}\,dt,$$

and this expresses $p$ in terms of $\chi_I$. Substituting this into (5.19) gives (5.17).

As a conclusion, we have seen that

$$p'' \ge p' \ge p,$$

so one has

$$\frac{1}{1 + Np''} \le \frac{1}{1 + Np'} \le \frac{1}{1 + Np}.$$

5.1.1 Conclusions

• The basic formula of Conclusions 2.4 still holds: using (5.9), (5.12) and (5.15), we see that the probability of guilt is given by

$$P_s(G \mid I, E) = E_s(U^{-1} \mid I, E) = E_s(U \mid I)^{-1} = \frac{1}{1 + Np'}.$$

• The conditional probability of guilt expressed in terms of $\chi$ is

$$P_s(G \mid I, E) = \frac{1}{1 + N(p + \sigma^2/p)}. \tag{5.20}$$

Therefore, ignoring the uncertainty (i.e., using $p$ instead of $p'$) is unfavourable to the suspect. If, on the other hand, one incorrectly assumes that there is uncertainty, then this is favourable to the suspect.

• The conditional probability of guilt expressed in terms of $\chi_I$ is

$$P_s(G \mid I, E) = \frac{1}{1 + Np'}. \tag{5.21}$$

In this case, the uncertainty in $\chi_I$ is irrelevant in the sense that its variance plays no role.

• The conditional probability of guilt expressed in terms of $\chi_{I,E}$ is

$$P_s(G \mid I, E) = \int_0^1 \frac{\chi_{I,E}(t)}{1 + Nt}\,dt. \tag{5.22}$$

Ignoring the uncertainty in $\chi_{I,E}$ (obtaining $P_s(G \mid I, E) = 1/(1 + Np'')$) would be favourable to the suspect.



5.2 Uncertainty about the criminal's subpopulation 

Suppose that, as in Section 4, the population is divided into subpopulations $X = X_1 \cup \cdots \cup X_m$, and that $C$ has characteristic $T$. We let $W_i$ be the random variable modelling the frequency of $T$ in $X_i$. The expectation and variance of $W_i$ are denoted by $p_i$ and $\sigma_i^2$ respectively. So, if $X(x) \neq X(y)$ then $T_x$ and $T_y$ are independent, and furthermore conditional on $W_i = p_i$ the $T_x$ for $x \in X_i$ are independent Bernoulli variables with probability of success $p_i$. We write $E = E_s$ and $G = G_s$ as before. Contrary to the situation in 4.1, the division of $X$ into subpopulations is a real restriction: the $T_x$ are only independent between subpopulations, not within one (only exchangeable).

5.2.1 Unconditioned on I

We first interpret $I$ as evidence, not as background information. The posterior probability of guilt, given that $S = s \in X_s$ and $C$ has $T$, is

$$P_s(G \mid I, E) = P_s(G \mid C \in X_s, I, E)\,P_s(C \in X_s \mid I, E).$$

The first term in the right hand side equals (see (5.12))

$$P_s(G \mid C \in X_s, I, E) = \frac{1}{1 + (N_s - 1)(p_s + \sigma_s^2/p_s)},$$

since we are now back in the setting of a homogeneous population. The second term equals (cf. (5.7), (5.8))

$$P_s(C \in X_s \mid I, E) = \frac{P_s(E \mid C \in X_s, I)\,P_s(C \in X_s \mid I)}{P_s(E \mid I)} = \frac{1 + (N_s - 1)(p_s + \sigma_s^2/p_s)}{N_s} \cdot \frac{p_s\beta_s}{\sum_{k=1}^m p_k\beta_k} \cdot \frac{1}{P_s(E \mid I)}.$$

It remains to compute $P_s(E \mid I)$:

$$P_s(E \mid I) = \sum_{j=1}^m P_s(E \mid C \in X_j, I)\,P_s(C \in X_j \mid I)$$

$$= \sum_{j \neq s} p_s\,\frac{p_j\beta_j}{\sum_{k=1}^m p_k\beta_k} + \frac{1 + (N_s - 1)(p_s + \sigma_s^2/p_s)}{N_s} \cdot \frac{p_s\beta_s}{\sum_{k=1}^m p_k\beta_k}$$

$$= \frac{N_s p_s \sum_{j \neq s} p_j\beta_j + \bigl(1 + (N_s - 1)(p_s + \sigma_s^2/p_s)\bigr)\beta_s p_s}{N_s \sum_{k=1}^m p_k\beta_k}.$$

Putting the parts together yields

$$P_s(G \mid I, E) = \frac{\beta_s}{N_s \sum_{j \neq s} p_j\beta_j + \bigl(1 + (N_s - 1)(p_s + \sigma_s^2/p_s)\bigr)\beta_s}. \tag{5.23}$$

This is the analogue of (4.6). For large $N_s$, the probability of guilt is roughly equal to

$$P_s(G \mid I, E) \approx \frac{\beta_s}{N_s \bigl( \sum_{j=1}^m p_j\beta_j + \beta_s\sigma_s^2/p_s \bigr)}. \tag{5.24}$$

The odds on guilt are then roughly equal to

$$\frac{P_s(G \mid I, E)}{P_s(G^c \mid I, E)} \approx \frac{\beta_s}{N_s} \cdot \frac{1}{\sum_{j=1}^m p_j\beta_j + \beta_s\sigma_s^2/p_s}. \tag{5.25}$$

The weight of the evidence, the likelihood ratio, is given by

$$\frac{P_s(E, I \mid G)}{P_s(E, I \mid G^c)} = \frac{N_s - \beta_s}{N_s \sum_{j=1}^m p_j\beta_j + N_s(p'_s - p_s)\beta_s - p'_s\beta_s} \tag{5.26}$$

(where $p'_s = p_s + \sigma_s^2/p_s$ as before), which reduces to (4.8) if $p'_s = p_s$. For large populations, (5.26) is roughly equal to

$$\frac{1}{\sum_{j=1}^m p_j\beta_j + (p'_s - p_s)\beta_s}, \tag{5.27}$$

as is also clear from (5.25). This formula is the analogue of (4.9). The likelihood ratio (5.27) suffers from the same problem as (4.9) in the sense that the prior probabilities $\beta_s$ are needed to compute it. For the same reason as before, it is therefore highly questionable whether the expert is allowed to report (5.27) in court. Therefore, we proceed by working conditional on $I$ and see what the computations tell us there.

5.2.2 Conditional on I

Now, $I$ is interpreted as background information. Let, as in (4.5),

$$\alpha_s = P_s(C \in X_s \mid I) = \frac{\beta_s p_s}{\sum_{j=1}^m \beta_j p_j}.$$

Then (5.23) can be rewritten as

$$P_s(G \mid I, E) = \frac{\alpha_s}{N_s p_s(1 - \alpha_s) + \alpha_s\bigl(1 + (N_s - 1)p'_s\bigr)},$$

where

$$p'_i = p_i + \frac{\sigma_i^2}{p_i}$$

is the expectation of $W_i$ given $C \in X_i$.

Since the prior odds, conditional on $I$, in favour of guilt of $s \in X_s$ are

$$\frac{P_s(G \mid I)}{P_s(G^c \mid I)} = \frac{\alpha_s}{N_s - \alpha_s},$$

the corresponding likelihood ratio is equal to

$$\frac{P_s(E \mid G, I)}{P_s(E \mid G^c, I)} = \frac{N_s - \alpha_s}{N_s p_s(1 - \alpha_s) + \alpha_s(N_s - 1)p'_s} = \frac{N_s - \alpha_s}{N_s\alpha_s(p'_s - p_s) + N_s p_s - p'_s\alpha_s}.$$

Of course, this leads to the same posterior probability of guilt as in (5.24). Notice that when $p'_s = p_s$, this reduces to $1/p_s$, i.e., we retrieve (4.7). For large $N_s$, the likelihood ratio is roughly equal to

$$\frac{1}{p_s + \alpha_s(p'_s - p_s)} = \frac{p_s}{p_s^2 + \alpha_s\sigma_s^2}. \tag{5.28}$$

This likelihood ratio also depends on prior quantities, this time on $\alpha_s$. Note however that there is a difference between (5.27) and (5.28). The latter only depends on quantities associated to the suspect's subpopulation, whereas the former does not. In this case there is a way to deal with the problem of having a prior quantity entering the formula for the likelihood ratio. In (5.28) one can be conservative and take $\alpha_s = 1$ to obtain a number which is not larger than the true likelihood ratio. In (5.27) one can of course do the same for all $\beta_j$'s, but there we have the problem that we have various $\beta_j$ in the expression, and the only thing we know is that they add up to 1. Therefore, we prefer (5.28), but the usual care must be exercised when using this likelihood ratio in court. The use of this likelihood ratio is, as always, dangerous and should involve a discussion of priors. A likelihood ratio out of context is not useful, and unfortunately, the context is rather complicated.
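The conservative bound obtained by setting $\alpha_s = 1$ in (5.28) can be illustrated with a few lines of code. This is a sketch only: the function names and the parameter values (a profile frequency of $10^{-8}$, $\sigma_s = p_s/2$, $\alpha_s = 0.3$) are assumptions chosen for illustration.

```python
def lr_exact(N_s, p_s, sigma2_s, alpha_s):
    """Likelihood ratio P_s(E | G, I) / P_s(E | G^c, I) for finite N_s."""
    p_prime = p_s + sigma2_s / p_s
    return (N_s - alpha_s) / (
        N_s * alpha_s * (p_prime - p_s) + N_s * p_s - p_prime * alpha_s
    )


def lr_large_population(p_s, sigma2_s, alpha_s):
    """Large-population approximation (5.28): p_s / (p_s^2 + alpha_s sigma_s^2)."""
    return p_s / (p_s ** 2 + alpha_s * sigma2_s)


# Hypothetical values for illustration.
p_s = 1e-8
sigma2_s = (p_s / 2) ** 2
approx = lr_large_population(p_s, sigma2_s, alpha_s=0.3)
conservative = lr_large_population(p_s, sigma2_s, alpha_s=1.0)
exact = lr_exact(1e7, p_s, sigma2_s, alpha_s=0.3)
print(exact, approx, conservative)
```

Because (5.28) is decreasing in $\alpha_s$, the value at $\alpha_s = 1$ never exceeds the value at the true $\alpha_s$, and without uncertainty ($\sigma_s^2 = 0$) both reduce to $1/p_s$.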



5.2.3 Conclusions 

• As in the case without uncertainty about the $T$-frequencies, we obtain two likelihood ratios that quantify the weight of the evidence: for large populations these are (5.27) if the evidence is taken to be $(I, E)$ and (5.28) if the evidence is taken to be only $E$. Since (5.28) can easily be turned into a conservative bound by setting $\alpha_s = 1$, we prefer to use (5.28), noting however that a report mentioning just the likelihood ratio without context is dangerous and potentially misleading.

• Only the uncertainty about the frequency of $T$ in the suspect's subpopulation plays a role in the likelihood ratio and the posterior probability of guilt; the uncertainty in the other subpopulations does not. The effect of this uncertainty is weighted by the probability that the true culprit belongs to this subpopulation.

• As in the classical case, if one conditions on $I$ then the likelihood ratio, given by (5.28) for large populations, only contains quantities associated to the suspect's subpopulation.

• Contrary to the classical case, if one considers the evidence to be $(I, E)$ then in the likelihood ratio for large populations (given by (5.27)) the suspect's subpopulation plays a special role, through the uncertainty about the $T$-frequency in this population.

• Regardless of whether one lets the evidence be $(I, E)$ or only $E$, the greater the uncertainty, the lower the weight of the evidence.



5.3 Uncertainty about the suspect's and the criminal's 
subpopulation 

Suppose now that it is also unknown to which subpopulation $s$ belongs. In that case we can no longer condition on $S = s$, but we can use the results of the previous section by writing

$$P(G \mid I, E) = \sum_{i=1}^m P(G \mid S \in X_i, I, E)\,P(S \in X_i \mid I, E). \tag{5.29}$$

We have determined the $P(G \mid S = s, I, E)$ in (5.23), and it is not difficult to see that this is equal to $P(G \mid S \in X_i, I, E)$ whenever $s \in X_i$. Hence, we only need to compute $P(S \in X_i \mid I, E)$.

The distribution of $S$ plays a role now, and we define

$$\varepsilon_i = P(S \in X_i)$$

to be the probability that $S$ belongs to $X_i$. Then the a priori probability of guilt is

$$P(G) = P(C = S) = \sum_{i=1}^m P(S \in X_i)\,P(C = S \mid S \in X_i) = \sum_{i=1}^m \frac{\varepsilon_i\beta_i}{N_i}.$$

Recall that $\beta_i$ is the probability that $C \in X_i$ and that we assume a uniform distribution over each subpopulation.

We now compute $P(S \in X_i \mid I, E)$:

$$P(S \in X_i \mid I, E) = \frac{P(E \mid S \in X_i, I)\,P(S \in X_i \mid I)}{P(E \mid I)} = \frac{P(E \mid S \in X_i, I)\,P(S \in X_i \mid I)}{\sum_{j=1}^m P(E \mid S \in X_j, I)\,P(S \in X_j \mid I)}. \tag{5.30}$$

It remains to compute $P(E \mid S \in X_i, I)$ and $P(S \in X_i \mid I)$. The latter is easy: since $I$ is information about $C$ and not about $S$, we have

$$P(S \in X_i \mid I) = \varepsilon_i.$$

The former can be computed as follows:

$$P(E \mid S \in X_i, I) = \sum_{j=1}^m P(E \mid C \in X_j, S \in X_i, I)\,P(C \in X_j \mid S \in X_i, I).$$

Now $P(C \in X_j \mid S \in X_i, I)$ is the probability that $C$ belongs to $X_j$, given that $S$ has been selected from $X_i$ and that $C$ has $T$. However, nothing is given about the $T$-status of $S$ and therefore $S \in X_i$ cannot be informative about $C$ at all, hence

$$P(C \in X_j \mid S \in X_i, I) = P(C \in X_j \mid I) = \frac{p_j\beta_j}{\sum_{k=1}^m p_k\beta_k}.$$

It remains to evaluate the terms $P(E \mid C \in X_j, S \in X_i, I)$. If $i \neq j$ then $S$ and $C$ belong to different populations. If $i = j$ then (5.7) applies, so

$$P(E \mid S \in X_i, C \in X_j, I) = \begin{cases} p_i & \text{if } i \neq j, \\[4pt] \dfrac{1 + (N_i - 1)(p_i + \sigma_i^2/p_i)}{N_i} & \text{if } i = j. \end{cases}$$

If we put these ingredients together, we obtain after some computations:

$$P(E \mid S \in X_i, I) = \frac{p_i}{\sum_{k=1}^m p_k\beta_k} \left( \sum_{j=1}^m p_j\beta_j + \frac{\beta_i}{N_i}\bigl(1 - p_i + (N_i - 1)\sigma_i^2/p_i\bigr) \right).$$

Plugging this into (5.30), we obtain

$$P(S \in X_i \mid I, E) = \frac{p_i \bigl( \sum_{k=1}^m p_k\beta_k + \frac{\beta_i}{N_i}(1 - p_i + (N_i - 1)\sigma_i^2/p_i) \bigr)\varepsilon_i}{\sum_{j=1}^m p_j \bigl( \sum_{k=1}^m p_k\beta_k + \frac{\beta_j}{N_j}(1 - p_j + (N_j - 1)\sigma_j^2/p_j) \bigr)\varepsilon_j} = \frac{p_i\varepsilon_i\beta_i}{N_i\,P_s(G \mid I, E, S \in X_i)} \cdot \frac{1}{\sum_{j=1}^m \dfrac{p_j\varepsilon_j\beta_j}{N_j\,P_s(G \mid I, E, S \in X_j)}}. \tag{5.31}$$

Substituting this expression into (5.29), we arrive at the posterior probability of guilt:

$$P(G \mid I, E) = \frac{\sum_{i=1}^m \dfrac{p_i\varepsilon_i\beta_i}{N_i}}{\sum_{i=1}^m \dfrac{p_i\varepsilon_i\beta_i}{N_i\,P_s(G \mid I, E, S \in X_i)}}. \tag{5.32}$$


Although this is not immediately obvious in the above presentation, the expression (5.32) is symmetric in $\varepsilon$ and $\beta$. To show this, notice that we only have to prove it for the denominator. Denoting

$$f(\varepsilon, \beta) = \sum_{i=1}^m p_i\varepsilon_i \sum_{j=1}^m p_j\beta_j,$$

we compute

$$f(\varepsilon, \beta) - f(\beta, \varepsilon) = \sum_{i=1}^m p_i\varepsilon_i \sum_{j=1}^m p_j\beta_j - \sum_{i=1}^m p_i\beta_i \sum_{j=1}^m p_j\varepsilon_j$$

$$= \sum_{i=1}^m p_i \left( \varepsilon_i \sum_{j=1}^m p_j\beta_j - \beta_i \sum_{j=1}^m p_j\varepsilon_j \right)$$

$$= \sum_{i,j=1}^m \bigl( p_i\varepsilon_i p_j\beta_j - p_i\beta_i p_j\varepsilon_j \bigr) = 0.$$

Intuitively, it is clear that (5.32) must possess this symmetry. Indeed, we have an unknown criminal $C$ and a suspect $S$, both with $T$. The probability that $S = C$ depends, as far as $\varepsilon$ and $\beta$ are concerned, on how they allow for $S$ and $C$ to be issued from the same subpopulation. Exchanging the distributions $\varepsilon$ and $\beta$ should not make a difference.
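The symmetry of (5.32) in $\varepsilon$ and $\beta$ is easy to confirm numerically. The sketch below is an illustration under stated assumptions: the function name and the parameter values are hypothetical, and the denominator of (5.32) is written in the expanded form that appears in (5.31), which avoids dividing by $P_s(G \mid I, E, S \in X_i)$.

```python
def guilt_unknown_subpopulation(N, p, sigma2, beta, eps):
    """Posterior probability of guilt (5.32), denominator expanded as in (5.31)."""
    m = range(len(N))
    mean_pb = sum(p[k] * beta[k] for k in m)
    num = sum(p[i] * eps[i] * beta[i] / N[i] for i in m)
    den = sum(
        p[i] * eps[i]
        * (mean_pb + beta[i] / N[i] * (1 - p[i] + (N[i] - 1) * sigma2[i] / p[i]))
        for i in m
    )
    return num / den


# Hypothetical three-subpopulation setting with sigma_i = p_i / 2.
N = [20e6, 1e6, 1e5]
p = [1e-8, 1e-7, 1e-6]
sigma2 = [(q / 2) ** 2 for q in p]
beta = [0.5, 0.3, 0.2]      # distribution of C over the subpopulations
eps = [0.9, 0.05, 0.05]     # distribution of S over the subpopulations

a = guilt_unknown_subpopulation(N, p, sigma2, beta, eps)
b = guilt_unknown_subpopulation(N, p, sigma2, eps, beta)  # roles exchanged
print(a, b)
```

With a single subpopulation and $\beta_1 = \varepsilon_1 = 1$ the function reduces to the homogeneous formula $1/(1 + (N_1 - 1)(p_1 + \sigma_1^2/p_1))$, which gives a second consistency check.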

To conclude this section we sketch the behaviour of (5.31) in extreme situations.

5.3.1 Probability that S ∈ X_i for extreme situations

• If all $\sigma_i^2 = 0$ and the $N_i$ are very large (compared to the $p_i^{-1}$), then (5.31) is approximately equal to

$$P(S \in X_i \mid I, E) \approx \frac{p_i\varepsilon_i}{\sum_{j=1}^m p_j\varepsilon_j} = P(S \in X_i \mid E).$$

This is reasonable, since if $p_i$ is big compared to $1/N_i$, then it is very unlikely that $C = S$ even when $T$ is taken into account. In this case, knowing that $C$ has $T$ does not really alter our belief about the subpopulation of $S$ which we have based on $E$.

• If all $\sigma_i^2 = 0$ and the $p_i$ are small compared to the $1/N_i$, then

$$P(S \in X_i \mid I, E) \approx \frac{p_i\varepsilon_i\beta_i/N_i}{\sum_{j=1}^m p_j\varepsilon_j\beta_j/N_j}. \tag{5.33}$$

If $\varepsilon_i = N_i/N$, then (5.33) reduces to

$$P(S \in X_i \mid I, E) \approx \frac{p_i\beta_i}{\sum_{j=1}^m p_j\beta_j} = P(C \in X_i \mid I), \tag{5.34}$$

which is also reasonable, since for very small $T$-frequencies it is quite likely that $C = S$.

• If also $\beta_i = N_i/N$, then (5.33) reduces to

$$P(S \in X_i \mid I, E) \approx \frac{N_i p_i}{\sum_{j=1}^m N_j p_j}, \tag{5.35}$$

and this is also understandable: if there is no information about the identity of $C$ or $S$, then the probability that $S \in X_i$ is proportional to the expected number of $T$-bearers in that subpopulation.



6 Database search 

In this section we suppose that there is a database $D \subset X$ containing the $T$-status of individuals $x_1, \dots, x_n$. After possibly renumbering, we write $X = \{x_1, \dots, x_{N+1}\}$ and let $D = \{x_1, \dots, x_n\}$. Suppose that $\sum_{d \in D} T_d = k$, that is, there are $k$ matches in the database. Let the evidence $E_D$ be given by

$$E_D = \{T_{x_1} = \cdots = T_{x_k} = 1,\ T_{x_{k+1}} = \cdots = T_{x_n} = 0\}.$$

We also assume that $P(C = x_i \mid I) = \alpha_i$ and that each individual has $T$ with probability $p$.

There are several pairs of propositions whose support by the data can be considered. These propositions all give rise to their own likelihood ratios or posterior probabilities, which has caused considerable confusion in the literature; see [B] for an account of this. Some of the forthcoming discussion also appears in [6] but we recall it here for completeness.

The most interesting case is where the database search produces a single match. Indeed, if there are no matches then the inquiry comes to an end as far as the database is concerned, and if there are several matches, then it is clear that chance matches have occurred. We will discuss three ways of looking at database matches:

1. Database-focused: in this case, the quantity of interest is $P(C \in D \mid E_D, I)$, the probability that the criminal is in the database;

2. Individual-focused: in this case, the quantity of interest is $P(C = x_1 \mid E_D, I)$, the conditional probability that $C = x_1$ supposing that $x_1$ has $T$;

3. Database effectiveness: in this case, the quantity of interest is $P(S = C \mid E_1, I)$, the probability that $S = C$, where $E_1$ denotes the event that $k = 1$ (a unique match, but not specified with whom), and where $S$ is the label of the matching individual.



6.1 Database-focused

First, we consider the proposition, found e.g. in [8],

$$C \in D,$$

and its negation $C \notin D$. The prior odds in favour of $C \in D$ are

$$\frac{P(C \in D \mid I)}{P(C \notin D \mid I)} = \frac{\alpha_1 + \cdots + \alpha_n}{\alpha_{n+1} + \cdots + \alpha_{N+1}},$$

where $\alpha_i = P(C = x_i \mid I)$ is the probability of guilt of $x_i$ given that $C$ has $T$. Clearly,

$$P(E_D \mid C \notin D, I) = p^k (1 - p)^{n-k}.$$

Similarly, it is easy to see that

$$P(E_D \mid C \in D, I) = \frac{p^{k-1}(1 - p)^{n-k}(\alpha_1 + \cdots + \alpha_k)}{\alpha_1 + \cdots + \alpha_n},$$

and therefore the likelihood ratio of evidence $E_D$ in favour of $C \in D$ is equal to

$$\frac{P(E_D \mid C \in D, I)}{P(E_D \mid C \notin D, I)} = \frac{\alpha_1 + \cdots + \alpha_k}{p(\alpha_1 + \cdots + \alpha_n)}. \tag{6.1}$$

The posterior odds in favour of $C \in D$ are

$$\frac{P(C \in D \mid E_D, I)}{P(C \notin D \mid E_D, I)} = \frac{\alpha_1 + \cdots + \alpha_k}{p(\alpha_{n+1} + \cdots + \alpha_{N+1})}. \tag{6.2}$$

If $k = 1$, $C \in D$ becomes logically equivalent to $C = x_1$, and we have

$$\frac{P(C = x_1 \mid E_D, I)}{P(C \notin D \mid E_D, I)} = \frac{P(C = x_1 \mid E_D, I)}{P(C \neq x_1 \mid E_D, I)} \tag{6.3}$$

$$= \frac{\alpha_1}{p(\alpha_{n+1} + \cdots + \alpha_{N+1})} = \frac{1}{p}\,\frac{P(C = x_1 \mid I)}{P(C \notin D \mid I)}. \tag{6.4}$$

This means that the likelihood ratio is uncontroversial and equal to $1/p$. In fact, it is not difficult to show that (6.3) also holds when the probability of having $T$ differs among the individuals in the database. In that case, $p$ in (6.3) should be replaced with $p_1 = P(T_{x_1} = 1 \mid I)$. Therefore, the weight of the evidence is not influenced by the presence in the database of people of different ethnic origin other than by the determination of the $\alpha_i$.

6.2 Individual-focused 

Of course, the proposition $C \in D$ is not really of interest to a court. Rather, presented with an individual $x$ such that $T_x = 1$, a court is interested in $P(C = x \mid E_D, I)$. Therefore, suppose as above that there are $k$ hits in the database, namely $x_1, \dots, x_k$. A computation analogous to the one above shows that the posterior odds in favour of $C = x_1$ are

$$\frac{P(C = x_1 \mid E_D, I)}{P(C \neq x_1 \mid E_D, I)} = \frac{\alpha_1}{\alpha_2 + \cdots + \alpha_k + p(\alpha_{n+1} + \cdots + \alpha_{N+1})}.$$

Notice that, if $k = 1$, we retrieve (6.3), as we should.
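The individual-focused odds can be cross-checked against a direct application of Bayes' rule over all candidates $C = x_i$. The sketch below is only an illustration: the function names are hypothetical, indices are zero-based (so `alpha[0]` plays the role of $\alpha_1$), and the uniform prior over ten individuals is an assumed toy example.

```python
def odds_formula(alpha, n, k, p):
    """alpha_1 / (alpha_2 + ... + alpha_k + p (alpha_{n+1} + ... + alpha_{N+1}))."""
    return alpha[0] / (sum(alpha[1:k]) + p * sum(alpha[n:]))


def odds_by_bayes(alpha, n, k, p):
    """Posterior odds on C = x_1 from the likelihood of E_D under each C = x_i."""
    def likelihood(i):
        if i < k:        # C is one of the k matching database members
            return p ** (k - 1) * (1 - p) ** (n - k)
        if i < n:        # C would be a database member without T: excluded by I
            return 0.0
        return p ** k * (1 - p) ** (n - k)   # C is outside the database

    joint = [a * likelihood(i) for i, a in enumerate(alpha)]
    return joint[0] / (sum(joint) - joint[0])


# Toy example: 10 individuals, database of n = 5 with k = 2 matches.
alpha = [0.1] * 10
direct = odds_by_bayes(alpha, n=5, k=2, p=0.1)
closed = odds_formula(alpha, n=5, k=2, p=0.1)
print(direct, closed)
```

For $k = 1$ the same functions return $\alpha_1 / (p\,P(C \notin D \mid I))$, the single-match odds of Section 6.1.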

6.3 Database effectiveness 

The most interesting case is when the database produces a unique hit. In that case, as we have seen, the posterior odds in favour of $S = C$ are given by (6.3). In this section we investigate a related, but different probability, namely the probability that a unique database hit, if we have one, is with the true culprit. This probability represents the long term effectiveness of the database in selecting the correct individual in the cases where it produces a unique match. We let $E_1$ denote the event that there is exactly one $T$-bearer in the database, and we will calculate

$$P(S = C \mid E_1, I),$$

where $S$ is the unique individual in the database with $T$. To do so, we write

$$P(S = C \mid E_1, I) = \sum_{i=1}^n P(S = C \mid T_{x_i} = 1, E_1, I)\,P(T_{x_i} = 1 \mid E_1, I).$$

First notice that (6.3) gives

$$P(S = C \mid T_{x_i} = 1, E_1, I) = \frac{\alpha_i}{\alpha_i + p\,P(C \notin D \mid I)}, \tag{6.5}$$

and it remains to compute $P(T_{x_i} = 1 \mid E_1, I)$:

$$P(T_{x_i} = 1 \mid E_1, I) = P(C = x_i \mid E_1, I) + \frac{1}{n}\,P(C \notin D \mid E_1, I)$$

$$= \frac{P(E_1 \mid C = x_i, I)\,P(C = x_i \mid I) + \frac{1}{n}\,P(E_1 \mid C \notin D, I)\,P(C \notin D \mid I)}{P(E_1 \mid I)}$$

$$= \frac{P(E_1 \mid C = x_i, I)\,P(C = x_i \mid I) + \frac{1}{n}\,P(E_1 \mid C \notin D, I)\,P(C \notin D \mid I)}{P(E_1 \mid C \in D, I)\,P(C \in D \mid I) + P(E_1 \mid C \notin D, I)\,P(C \notin D \mid I)}$$

$$= \frac{\alpha_i + p\,P(C \notin D \mid I)}{P(C \in D \mid I) + np\,P(C \notin D \mid I)},$$

where in the last step we used that $P(E_1 \mid C \in D, I) = (1 - p)^{n-1}$ and $P(E_1 \mid C \notin D, I) = np(1 - p)^{n-1}$. It follows that

$$P(S = C \mid E_1, I) = \sum_{i=1}^n \frac{\alpha_i}{\alpha_i + p\,P(C \notin D \mid I)} \cdot \frac{\alpha_i + p\,P(C \notin D \mid I)}{P(C \in D \mid I) + np\,P(C \notin D \mid I)} = \frac{P(C \in D \mid I)}{P(C \in D \mid I) + np\,P(C \notin D \mid I)},$$

which can also be written in odds form:

$$\frac{P(C \in D \mid E_1, I)}{P(C \notin D \mid E_1, I)} = \frac{P(S = C \mid E_1, I)}{P(S \neq C \mid E_1, I)} = \frac{1}{np}\,\frac{P(C \in D \mid I)}{P(C \notin D \mid I)}, \tag{6.6}$$

with corresponding likelihood ratio $1/(np)$. If the database is comprised of individuals coming from different subpopulations, then (6.6) does not hold. However, in that case one may view the database as a disjoint union $D = D_1 \cup \cdots \cup D_m$, where $D_i$ is the subset of $D$ containing individuals from subpopulation $i$. For each of these separately, (6.6) holds.

It is rather interesting to see what happens with the odds on $S = C$ (given $E_1$ and $I$) when the size of the database grows. It may seem from (6.6) that as $n$ grows, the odds on $S = C$ decrease. However, this is not true in general, since $P(C \in D \mid I)$ may also depend on $n$. It does, however, mean that enlarging a database does not necessarily improve its effectiveness, in the sense of increasing the odds (6.6) on a unique match being with the true offender. For example, suppose that a database $D_n$ of size $n$ yields $P(C \in D_n \mid I) = q_n$, and that a larger database $D_{2n}$ of size $2n$ yields $P(C \in D_{2n} \mid I) = q_{2n}$. If $D_n \subset D_{2n}$ then naturally $q_{2n} \ge q_n$, but the probability that $S = C$ given a unique match in $D_{2n}$ is greater than the probability that $S = C$ given a unique match in $D_n$ only when

$$\frac{q_{2n}}{1 - q_{2n}} > 2\,\frac{q_n}{1 - q_n}.$$

This can be explained intuitively: if one adds many people who are unlikely to be $C$ to the database, then the increased probability of a chance match with one of these new individuals outweighs the increased probability that $C$ is in the database, so that it becomes less likely that a unique match actually is a match with the criminal.

Hence the value of a unique match may increase or decrease with the size of the database, and it is not hard to see that the probability of a unique match itself may (independently) decrease or increase.
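The effect of doubling the database on the odds (6.6) can be sketched in a few lines. The function names and the numerical values below are hypothetical and serve only to illustrate the criterion $q_{2n}/(1 - q_{2n}) > 2 q_n/(1 - q_n)$.

```python
def unique_match_odds(q, n, p):
    """Odds (6.6) on S = C given a unique match, with q = P(C in D | I)."""
    return q / ((1 - q) * n * p)


def doubling_helps(q_n, q_2n):
    """True when doubling the database size raises the odds (6.6)."""
    return q_2n / (1 - q_2n) > 2 * q_n / (1 - q_n)


# Hypothetical numbers: n = 1000 profiles, match probability p = 1e-6.
n, p = 1000, 1e-6
odds_small = unique_match_odds(0.20, n, p)
odds_weak_growth = unique_match_odds(0.25, 2 * n, p)    # q grows only a little
odds_strong_growth = unique_match_odds(0.50, 2 * n, p)  # q grows substantially
print(odds_small, odds_weak_growth, odds_strong_growth)
```

In the first doubling scenario the added profiles bring in more chance matches than probability of containing the criminal, so the odds fall; in the second they rise.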

6.4 Conclusions 

• If it is known with whom the match is, say with $x_i$, then (cf. (6.3)) the posterior odds of guilt are given by

$$\frac{P(C = x_i \mid E_D, I)}{P(C \neq x_i \mid E_D, I)} = \frac{\alpha_i}{p\,P(C \notin D \mid I)}.$$

Notice that this quantity only depends on $\alpha_i = P(C = x_i \mid I)$, on the likelihood $p$ of a chance match with $x_i$ and on the a priori probability that the database contains the criminal. As the database increases, $P(C \notin D \mid I)$ decreases, but depending on $\alpha_i/p$ the posterior probability $P(C = x_i \mid E_D, I)$ may be greater or smaller than for a smaller database.

• If it is not specified with which individual the match is, and the probability of having $T$ is $p$ for everyone in the database, then the posterior odds that the match is with the criminal are given by (cf. (6.6))

$$\frac{P(S = C \mid E_1, I)}{P(S \neq C \mid E_1, I)} = \frac{1}{np}\,\frac{P(C \in D \mid I)}{P(C \notin D \mid I)}.$$

These odds describe the long-term behaviour of the database, i.e., the proportion in the long run of unique matches that are matches with the true criminal. Naturally, enlarging the database always increases the probability that the criminal is contained in it. But the probability of a unique match may increase or decrease, and (independently) the value of a unique match may increase or decrease. In many cases, in an enlarged database the probability of a unique match increases, but the probability of a unique match being with the true offender decreases.

7 Examples

In this section we illustrate the obtained results by considering some examples. We have chosen to cast most of these examples in a DNA-setting, as this provides one of the few types of forensic evidence that are so well understood that more or less exact computations can be performed.

The uncertainty surrounding DNA-profile frequency estimates depends on the size of the database from which allele frequencies are estimated. A possible model is to define a prior distribution of allele frequencies, and to update this distribution with the database to obtain a posterior distribution. An often used approach is to use Dirichlet distributions (see [H] for an account of the method and a discussion on the sensitivity for the choice of prior). Doing this for a database containing alleles of 230 persons (for many forensic labs the actual size of their database is a few hundred individuals), it seems (based on simulations for DNA-profiles with six or seven loci and frequencies between $10^{-10}$ and $10^{-7}$) reasonable to use a standard deviation $p/3 \le \sigma \le 2p/3$ in the examples below.

We will in each example freely use the notation introduced in the section that it illustrates.

7.1 Classical island problem with uncertain T-frequency 

We start with the simple version of a homogeneous population $X$ of size $N + 1$ and profile frequency $p$. As we have seen (cf. (5.20) and (5.21)), the posterior probability of guilt is equal to $P_s(G \mid I, E) = 1/(1 + N(p + \sigma^2/p)) = 1/(1 + Np')$. With $p/3 \le \sigma \le 2p/3$, we get $p' = p + \sigma^2/p \in [10p/9, 13p/9]$. Thus, the effect of the uncertainty about $p$ is to effectively increase $p$, or equivalently, to decrease the likelihood ratio associated to $(I, E)$ or to $E$. It may be prudent to use $\sigma = p$. For example, with $N = 10^7$, $p = 10^{-8}$, $\sigma = p$, we have $P_s(G \mid I, E) = 0.83$ instead of $0.91$.
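The numbers in this example are simple to reproduce. The short sketch below, with a hypothetical function name, evaluates $1/(1 + Np')$ with $p' = p + \sigma^2/p$ for a few values of $\sigma$; the choice of $\sigma$ values is made here for illustration.

```python
def posterior_guilt(N, p, sigma):
    """1 / (1 + N p') with p' = p + sigma^2 / p, cf. (5.20) and (5.21)."""
    return 1.0 / (1.0 + N * (p + sigma ** 2 / p))


N, p = 1e7, 1e-8
for label, sigma in [("0", 0.0), ("p/3", p / 3), ("2p/3", 2 * p / 3), ("p", p)]:
    print(f"sigma = {label}: {posterior_guilt(N, p, sigma):.2f}")
```

The first and last lines of the output correspond to the values 0.91 and 0.83 quoted in the text.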

7.2 Subpopulations and likelihood ratios 

We now illustrate the results of Section 4. Suppose that a crime has been committed in a heterogeneous population $X = X_1 \cup X_2$, with $N_1 = 10^7$ and $N_2 = 10^5$. Prior to DNA-analysis it is estimated that the crime could equally probably have been committed by a member of $X_1$ as by a member of $X_2$, i.e., $\beta_1 = \beta_2 = 0.5$. Now a DNA-trace of the criminal is found, giving rise to a profile $T$. The forensic lab calculates $p_1 = 10^{-9}$ and $p_2 = 10^{-8}$.

7.2.1 Unconditioned on the profile

The likelihood ratio (4.9) (taking both the fact that the criminal and the suspect have $T$ as evidence) equals $1/(p_1\beta_1 + p_2\beta_2) = 1.8 \times 10^8$. This likelihood ratio holds for any suspect $s$, as long as $S$ is independent of $C$.

With this likelihood ratio we obtain, for $s \in X_1$, posterior odds in favour of guilt equal to

$$(p_1\beta_1 + p_2\beta_2)^{-1}\,\beta_1/N_1 \approx 9,$$

corresponding to (cf. (4.6)) $P_s(G \mid I, E) = 0.9$. For $s \in X_2$ the posterior odds are

$$(p_1\beta_1 + p_2\beta_2)^{-1}\,\beta_2/N_2 \approx 910,$$

such that $P_s(G \mid I, E) = 0.999$.

7.2.2 Conditional on the profile

Given the fact that $C$ has $T$ and the frequencies $p_1, p_2$, we can also first calculate $P(C \in X_i \mid I) = \alpha_i$. This gives $\alpha_1 = 0.09$ and $\alpha_2 = 0.91$: since the profile $T$ is rarer in $X_1$, it is much more likely that the criminal is from $X_2$. The odds on $C$ belonging to $X_2$ are $\alpha_2/\alpha_1 = 10$. If this is taken as information relative to which everything else is conditioned, then the likelihood ratio associated to having $T$ is the inverse random match probability for the suspect: $1/p_1$ or $1/p_2$. This gives rise to the same $P_s(G \mid I, E)$: if $s \in X_1$ then the posterior odds are

$$p_1^{-1} \times \alpha_1/(N_1 - \alpha_1) \approx \alpha_1/(N_1 p_1) = 0.09/(10^{-9} \cdot 10^7) = 9,$$

as above. Similarly, for $s \in X_2$, we get posterior odds

$$p_2^{-1} \times \alpha_2/(N_2 - \alpha_2) \approx \alpha_2/(N_2 p_2) = 0.91/(10^{-8} \cdot 10^5) = 910,$$

as above.
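That the two routes lead to the same posterior odds can be verified directly. The sketch below uses the numbers of this example; the function names are hypothetical, and the unconditioned route uses the large-population likelihood ratio (4.9) together with the approximate prior odds $\beta_s/N_s$, as in Section 7.2.1.

```python
N = [1e7, 1e5]        # subpopulation sizes N_1, N_2
beta = [0.5, 0.5]     # prior probabilities that C is in X_1, X_2
p = [1e-9, 1e-8]      # profile frequencies p_1, p_2

mean_pb = sum(q * b for q, b in zip(p, beta))
alpha = [q * b / mean_pb for q, b in zip(p, beta)]   # P(C in X_i | I)


def odds_unconditioned(s):
    """Likelihood ratio (4.9) times the prior odds beta_s / N_s."""
    return (1 / mean_pb) * beta[s] / N[s]


def odds_conditioned(s):
    """Likelihood ratio 1 / p_s times the prior odds alpha_s / (N_s - alpha_s)."""
    return (1 / p[s]) * alpha[s] / (N[s] - alpha[s])


for s in (0, 1):
    print(odds_unconditioned(s), odds_conditioned(s))
```

The two columns agree up to the large-population approximation, which is negligible for subpopulations of this size.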

7.2.3 Consequences of errors 

When statements are made regarding the subpopulation to which C belongs, 
one has to be careful to note whether or not / has been taken into account. 
Indeed taking a>i equal to fy, that is, cti = a 2 = 0.5, we overestimate posterior 
odds in favour of guilt with a factor 10 for suspects from X\ and underes- 
timate them with the same factor for suspects from X 2 - This is a serious 
overestimate of the actual odds for suspects from X\. In this example, it 
leads to a posterior probability of guilt of 0.98 (instead of 0.90). 
Finally, we note that if the forensic lab assumes $p_2 = p_1 = 10^{-9}$ for
both populations, e.g. because it always uses the population frequencies of
the dominant population $X_1$, then we arrive at $\alpha_i = \beta_i$. The posterior odds
in favour of guilt will in that case be calculated to be $p_1^{-1}\alpha_i/(N_i - \alpha_i) \approx \alpha_i/(p_1 N_i)$
for $s \in X_i$. In this example, these odds are 50 for $s \in X_1$ and 5000
for $s \in X_2$, which is an overestimate in both cases.
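The effect of both mistakes can be tabulated with a few lines of Python. This is a sketch under the approximation $\alpha_i/(p_i N_i)$ for the posterior odds used above; the helper name `approx_posterior_odds` is hypothetical, and the correct $\alpha_i$ are again obtained by Bayes' rule, which is an assumption consistent with the quoted values.

```python
# Same two-subpopulation example; numbers taken from the text.
p = [1e-9, 1e-8]      # true profile frequencies p_1, p_2
beta = [0.5, 0.5]     # P(C in X_i) without using that C has the profile
N = [1e7, 1e5]        # subpopulation sizes N_1, N_2


def approx_posterior_odds(alpha_i, p_i, N_i):
    """Approximate posterior odds alpha_i / (p_i * N_i) for a suspect in X_i."""
    return alpha_i / (p_i * N_i)


# Correct calculation: alpha_i takes into account that C has the profile.
norm = sum(p_i * b_i for p_i, b_i in zip(p, beta))
alpha = [p_i * b_i / norm for p_i, b_i in zip(p, beta)]
correct = [approx_posterior_odds(a, p_i, N_i) for a, p_i, N_i in zip(alpha, p, N)]

# Mistake 1: alpha_i is replaced by beta_i, with the true frequencies kept.
alpha_is_beta = [approx_posterior_odds(b, p_i, N_i) for b, p_i, N_i in zip(beta, p, N)]

# Mistake 2: the lab uses p_1 for both subpopulations, so alpha_i = beta_i as well.
single_frequency = [approx_posterior_odds(b, p[0], N_i) for b, N_i in zip(beta, N)]

print("correct odds:          ", [round(x, 1) for x in correct])
print("alpha = beta:          ", [round(x, 1) for x in alpha_is_beta])
print("p_2 = p_1, alpha = beta:", [round(x, 1) for x in single_frequency])
```

With odds of 50 for a suspect from $X_1$, the probability of guilt is $50/51 \approx 0.98$, as stated in the text.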

7.3 Subpopulations: general case 

We next illustrate the results that we have obtained for the case where the
population is heterogeneous w.r.t. T-probability, and there is uncertainty
about the profile frequency in each population, as well as uncertainty about
the subpopulation to which an individual belongs. This is described in Section
5.3. Since there are many parameters that can be varied, we will keep some
of them fixed throughout. We assume that the population consists of three
disjoint subpopulations $X_1, X_2, X_3$, where $X_1$ is the dominant one, and the
others are much smaller. We set $N_1 = 20 \cdot 10^6$, $N_2 = 10^6$, $N_3 = 10^5$ and
$\sigma = p/2$. We will compare the true posterior probability of guilt $P(G \mid I,E)$
with the probability obtained assuming that for $X_2, X_3$ the same T-frequency
$p_1$ is used as for $X_1$. This allows one to judge what the consequences are of
having a subpopulation without knowing so. For example, there may be a
region of the country with a relatively high T-frequency due to its relative
isolation in the past. In practice it can be difficult to say with certainty if a
given individual belongs to that subpopulation.

We compute for several choices of $p_i$, $\epsilon_i$ and $\beta_i$ the true probability of
guilt and compare it to what one would obtain if $p_2, p_3$ were ignored,
namely (5.20) with $N + 1 = N_1 + N_2 + N_3$ and $p = p_1$. We denote this result
by $P^{hom}(G \mid I,E)$ and call it the naive probability of guilt.
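The naive probability can be reproduced numerically under two assumptions that are not stated in this section: that the fixed parameter set to $p/2$ above is the standard deviation of the uncertain profile frequency, and that (5.20) has the familiar form $1/(1 + N\,E[p^2]/E[p])$ for a homogeneous population with uncertain frequency. The function name below is hypothetical; the sketch is only meant as a plausibility check of the value 0.79 reported in the tables.

```python
def naive_guilt_probability(n_others, p_mean, p_sd):
    """Posterior probability of guilt in a homogeneous population.

    Assumed form (not taken verbatim from the paper):
    1 / (1 + N * E[p^2] / E[p]), with E[p^2] = p_mean^2 + p_sd^2.
    """
    expected_other_bearers = n_others * (p_mean + p_sd ** 2 / p_mean)
    return 1.0 / (1.0 + expected_other_bearers)


N1, N2, N3 = 20e6, 1e6, 1e5
p1 = 1e-8
n_others = N1 + N2 + N3 - 1          # N, with N + 1 the total population size

p_hom = naive_guilt_probability(n_others, p1, p1 / 2)
p_hom_known_frequency = naive_guilt_probability(n_others, p1, 0.0)

print(f"naive probability, sd = p/2: {p_hom:.2f}")
print(f"naive probability, sd = 0:   {p_hom_known_frequency:.2f}")
```

Under these assumptions the uncertainty about the frequency lowers the naive probability from about 0.83 to 0.79.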

Example 7.1. Let $p_1 = 10^{-8}, p_2 = 10^{-7}, p_3 = 10^{-6}$. We keep the $\epsilon_i$ fixed to a
choice where it is 90% certain that $S \in X_1$, not knowing $I$ or $E$. The results
are summarized in Table 1.

Table 1: Guilt probabilities for $p_1 = 10^{-8}, p_2 = 10^{-7}, p_3 = 10^{-6}$

| $(\epsilon_1,\epsilon_2,\epsilon_3)$ | $(\beta_1,\beta_2,\beta_3)$ | $P(G \mid I,E)$ | $P^{hom}(G \mid I,E)$ |
|---|---|---|---|
| (0.9,0.05,0.05) | (0.999,0.0005,0.0005) | 0.50 | 0.79 |
| (0.9,0.05,0.05) | uniform | 0.70 | 0.79 |
| (0.9,0.05,0.05) | (0.99,0.005,0.005) | 0.74 | 0.79 |
| (0.9,0.05,0.05) | (0.9,0.05,0.05) | 0.84 | 0.79 |

Notice that the true probability of guilt may be
smaller or greater than the naive probability. In the first line with $\beta_1 = 0.999$,
there is considerable uncertainty as to the subpopulation to which $S$ belongs
given $I,E$; in fact $P_s(S \in X_1 \mid I,E) = 0.40$, $P_s(S \in X_3 \mid I,E) = 0.56$.
Since for this choice of parameters $P_s(G \mid I,E, S \in X_3)$ (given by (5.23)) is
only 0.32, we get a probability of guilt equal to 0.50, much smaller than the
naive probability. However, as $\beta_1$ decreases, so does $P_s(S \in X_1 \mid I,E)$, and
$P_s(S \in X_3 \mid I,E)$ grows. In the last line of Table 1, $P_s(S \in X_3 \mid I,E)$ is
large (equal to 0.95), so the posterior probability of guilt is predominantly
given by (5.23) applied to $s \in X_3$, which is 0.89 for these parameters.

Example 7.2. As observed above, we obtain the same probabilities $P(G \mid I,E)$
(and of course, the same naive probability of guilt), when in the above
example $\epsilon$ and $\beta$ are exchanged. The explanation for these probabilities is
somewhat different. In the first line of Table 1 (now with $\epsilon_1 = 0.999$), it is
quite likely that $S$ belongs to $X_1$ given $I,E$; in fact $P_s(S \in X_1 \mid I,E) = 0.79$,
$P_s(S \in X_3 \mid I,E) = 0.21$. Since for this choice of parameters
$P_s(G \mid I,E, S \in X_1)$ (given by (5.23)) is only 0.40, we get a probability of guilt
equal to 0.50, much smaller than the naive probability. However, exactly as
for Example 7.1, as $\epsilon_1$ decreases, so does $P_s(S \in X_1 \mid I,E)$, and
$P_s(S \in X_3 \mid I,E)$ grows.

These examples show that the effect of having subpopulations can be
considerable when the profile is more common among the smaller
subpopulations, even when both $S$ and $C$ are likely to come from the largest
subpopulation. The magnitude and the direction of the subpopulation effect depend
strongly on the a priori probabilities for $S$ and $C$ to belong to each of the
subpopulations.

Example 7.3. Letting $S$ and $C$ be likely to come from $X_2$ or $X_3$, we get a
posterior probability of guilt between 0.80 and 0.85 which does not depend
strongly on the precise choice of $\epsilon_i$ and $\beta_i$. This is understandable since
these choices all make $P(S \in X_1 \mid I,E)$ small, and (5.23) applied to $X_2$ and
$X_3$ yields 0.89 for both populations (note that they have the same expected
number of T-bearers).

Example 7.4. Consider the case where $p_2$ and $p_3$ are smaller than $p_1$, for
example $p_1 = 10^{-8}, p_2 = 10^{-9}, p_3 = 10^{-10}$. The population as a whole then
has a smaller number of expected T-bearers compared to when $p_2 = p_3 = p_1$.
The true probability of guilt exceeds the naive probability unless one is almost
sure that $S$ and $C$ are from different subpopulations, as illustrated in Table 2.
Table 2: Guilt probabilities for $p_1 = 10^{-8}, p_2 = 10^{-9}, p_3 = 10^{-10}$

| $(\epsilon_1,\epsilon_2,\epsilon_3)$ | $(\beta_1,\beta_2,\beta_3)$ | $P(G \mid I,E)$ | $P^{hom}(G \mid I,E)$ |
|---|---|---|---|
| uniform | uniform | 0.80 | 0.79 |
| (0.9,0.05,0.01) | (0.9,0.05,0.05) | 0.80 | 0.79 |
| (0.2,0.6,0.2) | (0.2,0.6,0.2) | 0.98 | 0.79 |
| (0.1,0.3,0.6) | (0.3,0.3,0.4) | 0.97 | 0.79 |
| (0.9,0.09,0.01) | (0.01,0.01,0.98) | 0.88 | 0.79 |
| (0.99,0.009,0.001) | (0.001,0.001,0.998) | 0.57 | 0.79 |

7.4 T-correlation: relatedness 

Suppose that $C$ has DNA-profile T with a population frequency of $10^{-7}$, i.e.,
$p_x = 10^{-7}$ for all $x \in X$. Now we select $s$ from $X$, and $T_s = 1$. Suppose
that $X = \{s, y_1, y_2, y_3, z_1, \ldots, z_N\}$ and $N = 10^6$, such that $c_{y_i,s} = 10^{-3}$
and $c_{z_i,s} = p_{z_i} = 10^{-7}$. Here we model a situation in which the suspect
$s$ has three brothers, whose T-probability is $10^{-3}$ given $T_s = 1$, and in which
the rest of the population is unrelated to $s$. If the a priori probabilities are
$P_s(C = s \mid I) = 0.4$, $P_s(C = y_i \mid I) = 0.1$, $P_s(C = z_j \mid I) = 0.3/10^6$, then the
correction factor (3.6) for the T-correlation is equal to

$$\frac{0.6}{3 \cdot 10^4 \cdot 0.1 + 10^6 \cdot 1 \cdot 0.3 \cdot 10^{-6}} = \frac{0.6}{0.3 + 3000} \approx \frac{1}{5000},$$

meaning that the likelihood ratio associated to $T_s = 1$ has been made 5000
times smaller, reducing it from $1/p_s = 10^7$ to 2000.

For the posterior probability of guilt $P_s(G \mid I,E)$, this means that it is
reduced from

$$\frac{\frac{2}{3}10^7}{1 + \frac{2}{3}10^7} \approx 1 - \frac{3}{2}10^{-7},$$

that we would obtain without T-correlation, to approximately $1 - 3/4000$.
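The arithmetic of this example can be traced step by step. The sketch below assumes that the correction factor has the form suggested by the displayed fraction, namely $(1 - \pi_s)/\sum_{x \neq s}(c_{x,s}/p_x)\pi_x$ with $\pi_x$ the prior probability that $x$ is the criminal; equation (3.6) itself is not reproduced in this section, so this form and all variable names are assumptions.

```python
# Relatedness example of Section 7.4; numbers taken from the text.
p = 1e-7                       # population frequency of the profile
n_brothers = 3
n_unrelated = 10 ** 6
c_brother = 1e-3               # P(brother has T | s has T)
c_unrelated = p                # unrelated persons: no correlation with s

prior_s = 0.4
prior_brother = 0.1            # for each of the three brothers
prior_unrelated = 0.3 / n_unrelated

# Denominator of the displayed fraction: sum over x != s of (c_x / p) * prior_x.
weighted_prior = (n_brothers * (c_brother / p) * prior_brother
                  + n_unrelated * (c_unrelated / p) * prior_unrelated)
correction = (1.0 - prior_s) / weighted_prior

likelihood_ratio = correction / p
prior_odds = prior_s / (1.0 - prior_s)

posterior_odds = prior_odds * likelihood_ratio
posterior_prob = posterior_odds / (1.0 + posterior_odds)

# For comparison: the same calculation without any T-correlation.
uncorrelated_odds = prior_odds / p
uncorrelated_prob = uncorrelated_odds / (1.0 + uncorrelated_odds)

print(f"correction factor: 1/{1 / correction:.0f}")
print(f"likelihood ratio:  {likelihood_ratio:.0f}")
print(f"posterior with correlation:    {posterior_prob:.6f}")
print(f"posterior without correlation: {uncorrelated_prob:.9f}")
```

Almost all of the reduction comes from the three brothers: their term in the denominator is 3000, against 0.3 for the million unrelated individuals.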



7.5 Biased search 

We recast the example given in Section 7.4 in the setting of a biased search,
to demonstrate the equivalence noted in Section 3.3. As in Section 7.4, $p_x = 10^{-7}$
for all $x \in X$; we suppose that $S = s$ has been selected and that $T_s = 1$, and
that there are $y_1, y_2, y_3 \in X$ such that $\sigma_{s,y_i} = 10^4\sigma_{s,s}$ for $i = 1,2,3$ and
$\sigma_{s,z_i} = \sigma_{s,s}$ for all $i$. The prior odds are as in Section 7.4. Then the likelihood
ratio associated to the evidence $T_s = 1$ is reduced by a factor of about 5000,
as in Section 7.4. In that example, the value was decreased since finding T
in $s$ made it more probable that population members $y_1, y_2, y_3$, which have
non-negligible prior probabilities of guilt, also have T. In this situation, it is
due to the fact that the selection procedure is such that if $s$ is selected, it
becomes less likely that $s$ is guilty:

$$\frac{\sigma_{s,s}P(C = s \mid I)}{\sum_{y \in X}\sigma_{s,y}P(C = y \mid I)} \approx \frac{1}{7500},$$

which is considerably less than 0.4. The fact that $s$ has T then raises the
probability of guilt to approximately $1 - 3/4000$ as above.
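A short numerical check of this recast example follows. It rests on an assumed reading of the search bias: $s$ is taken to be $10^4$ times as likely to be selected when one of $y_1, y_2, y_3$ is the criminal as when $s$ or an unrelated person is the criminal. The relative weights and variable names are illustrative only; the priors and the frequency are those of Section 7.4.

```python
# Biased-search version of the example; priors as in Section 7.4.
p = 1e-7
n_unrelated = 10 ** 6
prior_s = 0.4
prior_brother = 0.1            # for each of y_1, y_2, y_3
prior_unrelated = 0.3 / n_unrelated

# Assumed relative probabilities of selecting s, given who the criminal is.
weight_s = 1.0
weight_brother = 1e4
weight_unrelated = 1.0

# Probability that s is the criminal once we know s has been selected.
total_weight = (weight_s * prior_s
                + 3 * weight_brother * prior_brother
                + n_unrelated * weight_unrelated * prior_unrelated)
prob_after_selection = weight_s * prior_s / total_weight

# The match T_s = 1 then contributes the usual likelihood ratio 1 / p.
odds_after_selection = prob_after_selection / (1.0 - prob_after_selection)
posterior_odds = odds_after_selection / p
posterior_prob = posterior_odds / (1.0 + posterior_odds)

print(f"P(C = s | s selected) = 1/{1 / prob_after_selection:.0f}")
print(f"P(G | I, E)           = {posterior_prob:.6f}")
```

The final probability coincides with the one found under T-correlation, which is the equivalence the example is meant to demonstrate.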



7.6 Database effectiveness 

In (6.6) we have computed the odds in favour of a unique database match
being with the true criminal. If the database is a random sample of the
population in the sense that $P(C \in \mathcal{D} \mid I) = |\mathcal{D}|/|X| = n/N$, then this
equation reads

$$\frac{P(S = C \mid E_1, I)}{P(S \neq C \mid E_1, I)} = \frac{1}{p(N - n)},$$

which is monotonically increasing in $n$, going from $1/((N - 1)p)$ for $n = 1$ to
infinity for $n = N$. It is not hard to derive this directly: since $n - 1$ persons
have been shown not to possess T, the population that cannot be excluded
has size $N - n + 1$. In that population, only the T-status of one individual
(the one that matched in $\mathcal{D}$) is known. Since $\mathcal{D}$ was a random sample as
defined above, the classical solution (2.4) applies.

If the database is not a random sample from the population in the above 
sense, then the situation is more interesting and quite different. 

Example 7.5. Let $p = 10^{-7}$ and suppose that with $n = 10^5$ one has
$P(C \in \mathcal{D} \mid I) = 0.2$. For example, this may be because the database consists of
previously convicted individuals and based on the probability of a rightful
conviction and of recidivism one arrives at such an estimate. For database
$\mathcal{D}$, the odds that a unique match is with $C$ are 25 to one, or equivalently,
$P(S = C \mid I,E_1) = 0.96$.

It may be possible to enlarge $\mathcal{D}$ to $\mathcal{D}'$ with $|\mathcal{D}'| = n'$ such that
$P(C \in \mathcal{D}' \mid I) = 0.5$, but only at the cost of adding very many individuals to $\mathcal{D}$,
e.g. with $n' = 2 \cdot 10^6$. In that case, the odds (6.6) on a unique match being
with the offender decrease to 5, i.e., one in six of such matches will be with
an innocent person.

The probability of actually obtaining a unique match is given by

$$P(E_1) = P(C \in \mathcal{D} \mid I)(1 - p)^{n-1} + P(C \notin \mathcal{D} \mid I)\,np(1 - p)^{n-1}.$$

For database $\mathcal{D}$, this evaluates to 0.206 and for database $\mathcal{D}'$ to 0.491.
Thus, in $\mathcal{D}'$ a search with a DNA-profile with population frequency $10^{-7}$ will
yield a unique match about half of the time, but only 5 out of 6 of these will
be with the true offender. About 10% of such searches will result in two or
more hits, and about 40% will not result in any hit.

When multiple matches are found, it is more likely that one of them is 
with the true offender but not a near certainty: e.g., in case two matches 
are found (which happens with probability 0.09), about one in ten of such 
double matches are both coincidental. 

For the original database $\mathcal{D}$, about 20.6% of searches result in a unique
match, almost all of which are with the offender; in the remaining cases one
almost always has no hits: the probability of having more than one match
being 0.002.
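The figures quoted in Example 7.5 can be reproduced as follows. The odds are taken as the ratio of the two terms of $P(E_1)$ (criminal in the database and nobody else matching, against criminal outside the database and exactly one coincidental match); the function names are illustrative, and $q$ stands for the probability that the database contains $C$.

```python
def unique_match_odds(q, n, p):
    """Odds that a unique match is with C, where q = P(C in database | I)."""
    return q / ((1.0 - q) * n * p)


def hit_probabilities(q, n, p):
    """Return P(no hit), P(exactly one hit), P(two or more hits) for one search."""
    nobody_else_matches = (1.0 - p) ** (n - 1)
    no_hit = (1.0 - q) * (1.0 - p) ** n
    unique = q * nobody_else_matches + (1.0 - q) * n * p * nobody_else_matches
    return no_hit, unique, 1.0 - no_hit - unique


p = 1e-7
databases = {"D": (0.2, 10 ** 5), "D'": (0.5, 2 * 10 ** 6)}

for name, (q, n) in databases.items():
    odds = unique_match_odds(q, n, p)
    no_hit, unique, multiple = hit_probabilities(q, n, p)
    print(f"{name}: odds = {odds:.0f}, P(S = C | unique match) = {odds / (1 + odds):.2f}, "
          f"no hit = {no_hit:.3f}, unique = {unique:.3f}, multiple = {multiple:.3f}")
```

Setting $q = n/N$ in `unique_match_odds` gives $1/(p(N - n))$, the random-sample expression of this section.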

Example 7.6. Suppose that the database is set up and expanded such that if
it has size $n$ then $P(C \in \mathcal{D}) = \sqrt{n/N}$. This is a model for a database in which
individuals with higher prior probability of guilt are put in the database with
higher probability. For example, if $\mathcal{D}$ contains the DNA-profiles of 10% of
the population, then it contains $C$ with probability 0.31. If $\mathcal{D}$ is enlarged to
contain 30% of the population, then it contains $C$ with probability 0.55.

In that case, the odds on a unique match being with the criminal in a
database of size $n$ are minimal for $n = N/4$. An example for $N = 2 \cdot 10^7$,
$p = 10^{-8}$ is given in Figure 1. With $n = N/4$ the odds in favour of a unique
match being with the true offender are 20. As the plots show, when the
database is relatively small the odds on a match being with the true offender
decrease rapidly, e.g. from 105 if $n = 50{,}000$ to 55 if $n = 200{,}000$. As $n$
grows further, the odds decrease (slowly) to 20 for $n = 5 \cdot 10^6 = N/4$. When
$n$ grows further, the odds increase again. When 50% of the population is
included ($n = 10^7$), they are 24.
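The shape of this curve can be explored with a small script. It assumes the same odds expression as in Example 7.5, $q/((1 - q)np)$ with $q = P(C \in \mathcal{D})$; writing $u = \sqrt{n/N}$ turns this into $1/(Np\,u(1 - u))$, which is smallest at $u = 1/2$, that is, at $n = N/4$. Function names are illustrative.

```python
import math


def containment_probability(n, N):
    """Model of Example 7.6: P(C in database) = sqrt(n / N)."""
    return math.sqrt(n / N)


def unique_match_odds(n, N, p):
    """Odds that a unique match in a database of size n is with C."""
    q = containment_probability(n, N)
    return q / ((1.0 - q) * n * p)


N = 2 * 10 ** 7
p = 1e-8

for n in (50_000, 200_000, N // 4, N // 2):
    print(f"n = {n:>10,}: P(C in D) = {containment_probability(n, N):.2f}, "
          f"odds = {unique_match_odds(n, N, p):.1f}")
```

The printed odds fall from roughly 105 to 20 and then rise again to about 24, matching the behaviour described for Figure 1.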

Thus, enlarging a database may at the same time increase the chance of 
obtaining a unique match from it, and diminish the value of such a match 
in the sense that the probability of it being with the true offender decreases. 
These examples suggest that the idea that the larger the database, the better, 
needs to be put into perspective. It is of course true that enlarging a database 
increases the probability that the criminal is included. It is also obvious 
that given a unique match in the database, the probability that it is with 
the criminal increases when the database is expanded and does not yield 
additional matches. But as we have seen, it does not follow that hits in larger 
databases are stronger evidence for guilt than hits in smaller databases. 